PALANTIR DATA ENGINEERING CERTIFICATION
EXAM QUESTIONS AND ANSWERS 2025
A developer is planning a mobile application for your
company's customers to use to track information about
their accounts. The developer is asking for your advice
on storage technologies. In one case, the developer
explains that they want to write messages each time a
significant event occurs, such as the client opening,
viewing, or deleting an account. This data is collected
for compliance reasons, and the developer wants to
minimize administrative overhead. What system would
you recommend for storing this data?
A. Cloud SQL using MySQL
B. Cloud SQL using PostgreSQL
C. Cloud Datastore
D. Stackdriver Logging
ANSWER: D. Stackdriver Logging is the best
option because it is a managed service designed for
storing logging data. Neither Option A nor B is as good
a fit because the developer would have to design and
maintain a relational data model and user interface to
view and manage log data. Option C, Cloud Datastore,
would not require a fixed data model, but it would still
require the developer to create and maintain a user
interface to manage log events.
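As a rough illustration of the event messages described in the question, the sketch below builds one structured log entry per account event using only the Python standard library. The function name `audit_event`, the field names, and the sample account ID are hypothetical. It assumes that the runtime or a logging agent forwards JSON lines written to standard output to the logging service, which should be verified for the target environment.

```python
import json
import sys
from datetime import datetime, timezone

# Hypothetical set of account events worth recording for compliance.
AUDITED_ACTIONS = {"open", "view", "delete"}


def audit_event(action: str, account_id: str) -> str:
    """Return one JSON log line describing an account event."""
    if action not in AUDITED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    entry = {
        "severity": "INFO",
        "message": f"account {action}",
        "action": action,
        "account_id": account_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(entry)


if __name__ == "__main__":
    # Write the entry to stdout; shipping it onward is left to the platform.
    sys.stdout.write(audit_event("open", "acct-123") + "\n")
```

The application only emits events; storage, retention, and viewing are left to the managed logging service, which is the low-overhead property the answer relies on.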
You are responsible for developing an ingestion
mechanism for a large number of IoT sensors. The
ingestion service should accept data up to 10 minutes
late. The service should also perform some
transformations before writing the data to a database.
Which of the managed services would be the best
option for managing late arriving data and performing
transformations?
A. Cloud Dataproc
B. Cloud Dataflow
C. Cloud Dataprep
D. Cloud SQL
ANSWER: B. Cloud Dataflow is a stream and batch
processing service that transforms data in flight and
handles late-arriving data through windowing with
allowed lateness. Option A, Cloud
Dataproc, is a managed Hadoop and Spark service and
not as well suited as Cloud Dataflow for the kind of
stream processing specified. Option C, Cloud Dataprep,
is an interactive tool for exploring and preparing data
sets for analysis. Option D, Cloud SQL, is a relational
database service, so it may be used to store data, but it
is not a service specifically for ingesting and
transforming data before writing to a database.
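The idea of accepting data up to 10 minutes late can be sketched in plain Python. This is a conceptual illustration only, not the Cloud Dataflow or Apache Beam API. The class name `LateDataWindower`, the 60-second window size, the Celsius-to-Fahrenheit transformation, and the simplification of treating the highest event time seen so far as the watermark are all assumptions made for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 60             # assumed fixed window size
ALLOWED_LATENESS_SECONDS = 600  # 10 minutes, from the question


class LateDataWindower:
    """Group readings into fixed event-time windows, tolerating late data."""

    def __init__(self):
        self.windows = defaultdict(list)  # window start -> transformed values
        self.watermark = 0                # highest event time seen so far
        self.dropped = 0                  # readings rejected as too late

    def add(self, event_time: int, value: float) -> bool:
        """Add a reading; return False if it arrived too late to accept."""
        self.watermark = max(self.watermark, event_time)
        window_start = event_time - (event_time % WINDOW_SECONDS)
        window_end = window_start + WINDOW_SECONDS
        # Reject the reading once the watermark has passed the end of its
        # window by more than the allowed lateness.
        if self.watermark > window_end + ALLOWED_LATENESS_SECONDS:
            self.dropped += 1
            return False
        # Example transformation: Celsius to Fahrenheit before storage.
        self.windows[window_start].append(value * 9 / 5 + 32)
        return True
```

A reading whose window closed less than 10 minutes before the current watermark is still placed in that window; anything older is counted in `dropped`. A managed stream processor takes care of this bookkeeping, which is why it fits the question better than a database or a batch cluster.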
A team of analysts has collected several CSV datasets
with a total size of 50 GB. They plan to store the
datasets in GCP and use Compute Engine instances to
run RStudio, an interactive statistics application. Data will be
loaded into RStudio using an RStudio data loading tool.
Which of the following is the most appropriate GCP
storage service for the datasets?
A. Cloud Storage
B. Cloud Datastore
C. MongoDB
D. Bigtable
ANSWER: A. Cloud Storage is the correct choice because the data in the files
is treated as an atomic unit of data that is loaded into
RStudio. Options B and C are incorrect because those
are document databases and there is no requirement for
storing the data in semistructured format with support
for fully indexed querying. Also, MongoDB is not a GCP
service. Option D is incorrect because, although you
could load CSV data into a Bigtable table, the volume
of data is not sufficient to warrant using Bigtable.
A team of analysts has collected several terabytes of
telemetry data in CSV datasets. They plan to store the
datasets in GCP and query and analyze the data using
SQL. Which of the following is the most appropriate
GCP storage service for the datasets?
A. Cloud SQL
B. Cloud Spanner
C. BigQuery
D. Bigtable
ANSWER: C. BigQuery is a managed analytical
database service that supports SQL and scales to
petabyte volumes of data. Options A and B are
incorrect because both are used for transaction
processing applications, not analytics. Option D is
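incorrect because Bigtable is a wide-column NoSQL
database designed for low-latency key-based reads and
writes, not for SQL analytics over CSV datasets.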
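To make "query and analyze the data using SQL" concrete, the sketch below loads a few CSV rows into a table and runs an aggregate query. It uses Python's built-in `sqlite3` module purely as a local stand-in so the example runs anywhere; it is not BigQuery and says nothing about BigQuery's client API. The column names, sample values, and the function name `average_by_device` are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical telemetry sample; real datasets would be loaded from files.
SAMPLE_CSV = """device_id,reading
a,10.0
a,20.0
b,5.0
"""


def average_by_device(csv_text: str) -> list:
    """Load CSV rows into a table and return (device_id, average) pairs."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE telemetry (device_id TEXT, reading REAL)")
    conn.executemany(
        "INSERT INTO telemetry VALUES (:device_id, :reading)", rows
    )
    # Typical analytical query: aggregate readings per device.
    result = conn.execute(
        "SELECT device_id, AVG(reading) FROM telemetry "
        "GROUP BY device_id ORDER BY device_id"
    ).fetchall()
    conn.close()
    return result
```

The same style of GROUP BY query is what the analysts would run at terabyte scale, where an analytical database is needed rather than an embedded or transactional one.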
You have been hired to consult with a startup that is
developing software for self-driving vehicles. The
company's product uses machine learning to predict the
trajectory of persons and vehicles. Currently, the
software is being developed using 20 vehicles, all
located in the same city. IoT data is sent from vehicles
every 60 seconds to a MySQL database running on a
Compute Engine instance using an n2-standard-8
machine type with 8 vCPUs and 32 GB of memory. The
startup wants to review their architecture and make any