Exam (elaborations)

GCP PROFESSIONAL DATA ENGINEER CERTIFICATION QUESTIONS AND CORRECT VERIFIED ANSWERS 2026/2027

Pages
157
Grade
A+
Uploaded on
13-04-2026
Written in
2025/2026

Institution
GCP PROFESSIONAL DATA ENGINEER CERTIFICATION
Course
GCP PROFESSIONAL DATA ENGINEER CERTIFICATION

Content preview

GCP PROFESSIONAL DATA ENGINEER CERTIFICATION
QUESTIONS AND CORRECT VERIFIED ANSWERS
2026/2027
A developer is planning a mobile application for your company's customers to
use to track information about their accounts. The developer is asking for your
advice on storage technologies. In one case, the developer explains that they
want to write messages each time a significant event occurs, such as the client
opening, viewing, or deleting an account. This data is collected for compliance
reasons, and the developer wants to minimize administrative overhead. What
system would you recommend for storing this data?

A. Cloud SQL using MySQL
B. Cloud SQL using PostgreSQL
C. Cloud Datastore
D. Stackdriver Logging
D. The correct answer is D. Stackdriver Logging (now Cloud Logging) is the best option because it is a
managed service designed for storing logging data. Neither Option A nor B is as
good a fit because the developer would have to design and maintain a relational
data model and user interface to view and manage log data. Option C, Cloud
Datastore, would not require a fixed data model, but it would still require the
developer to create and maintain a user interface to manage log events.
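To illustrate the kind of compliance event described in this question, here is a minimal sketch that formats each account event as one JSON log line. The field names (`severity`, `event`, `account_id`) and the function `make_audit_entry` are hypothetical, and the sketch uses only the Python standard library; it does not call the logging service itself.

```python
import json
from datetime import datetime, timezone

# Hypothetical set of compliance-relevant actions taken from the question text.
ALLOWED_ACTIONS = {"open", "view", "delete"}

def make_audit_entry(account_id: str, action: str) -> str:
    """Return one JSON log line for a compliance-relevant account event."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    entry = {
        "severity": "NOTICE",            # assumed severity label
        "event": f"account.{action}",    # assumed event naming scheme
        "account_id": account_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(entry, sort_keys=True)

print(make_audit_entry("acct-123", "view"))
```

Because each event is a self-describing record, no relational schema or custom viewing interface has to be designed, which is the overhead the answer attributes to Options A, B, and C.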
You are responsible for developing an ingestion mechanism for a large number
of IoT sensors. The ingestion service should accept data up to 10 minutes late.
The service should also perform some transformations before writing the data
to a database. Which of the managed services would be the best option for
managing late arriving data and performing transformations?

A. Cloud Dataproc
B. Cloud Dataflow
C. Cloud Dataprep
D. Cloud SQL
B. The correct answer is B. Cloud Dataflow is a stream and batch processing
service that is used for transforming data and processing streaming data. Option
A, Cloud Dataproc, is a managed Hadoop and Spark service and not as well suited
as Cloud Dataflow for the kind of stream processing specified. Option C, Cloud
Dataprep, is an interactive tool for exploring and preparing data sets for analysis.
Option D, Cloud SQL, is a relational database service, so it may be used to store
data, but it is not a service specifically for ingesting and transforming data before
writing to a database.
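The idea of accepting data up to 10 minutes late and transforming it before storage can be sketched in plain Python. This is not the Dataflow or Apache Beam API; it only mimics the concept of an allowed-lateness check against a watermark. The Fahrenheit-to-Celsius transformation and the sample readings are assumptions made for illustration.

```python
from datetime import datetime, timedelta

ALLOWED_LATENESS = timedelta(minutes=10)  # from the question: up to 10 minutes late

def accept(event_time: datetime, watermark: datetime) -> bool:
    """Keep an event unless it trails the watermark by more than the allowed lateness."""
    return watermark - event_time <= ALLOWED_LATENESS

def to_celsius(reading_f: float) -> float:
    """Hypothetical transformation applied before the database write."""
    return round((reading_f - 32) * 5 / 9, 2)

watermark = datetime(2026, 1, 1, 12, 0)
readings = [
    (datetime(2026, 1, 1, 11, 55), 68.0),   # 5 minutes late: kept
    (datetime(2026, 1, 1, 11, 45), 212.0),  # 15 minutes late: dropped
    (datetime(2026, 1, 1, 12, 0), 32.0),    # on time: kept
]

transformed = [to_celsius(value) for ts, value in readings if accept(ts, watermark)]
print(transformed)  # [20.0, 0.0]
```

In a real pipeline the watermark is tracked by the streaming service rather than passed in by hand; Apache Beam, the programming model behind Dataflow, exposes an allowed-lateness setting on its windowing transform for this purpose.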
A team of analysts has collected several CSV datasets with a total size of 50 GB.
They plan to store the datasets in GCP and use Compute Engine instances to run
RStudio, an interactive stats application. Data will be loaded into RStudio using
an RStudio data loading tool. Which of the following is the most appropriate
GCP storage service for the datasets?

A. Cloud Storage
B. Cloud Datastore
C. MongoDB
D. Bigtable
A. The correct answer is A, Cloud Storage, because the data in the files is treated
as an atomic unit of data that is loaded into RStudio. Options B and C are
incorrect because those are document databases and there is no requirement for
storing the data in semistructured format with support for fully indexed querying.
Also, MongoDB is not a GCP service. Option D is incorrect because, although you
could load CSV data into a Bigtable table, the volume of data is not sufficient to
warrant using Bigtable.
A team of analysts has collected several terabytes of telemetry data in CSV
datasets. They plan to store the datasets in GCP and query and analyze the data
using SQL. Which of the following is the most appropriate GCP storage service
for the datasets?

A. Cloud SQL
B. Cloud Spanner
C. BigQuery
D. Bigtable
C. The correct answer is C, BigQuery, which is a managed analytical database
service that supports SQL and scales to petabyte volumes of data. Options A and B
are incorrect because both are used for transaction processing applications, not
analytics. Option D is incorrect because Bigtable is a wide-column NoSQL store
built for low-latency, key-based access rather than SQL analytics.
You have been hired to consult with a startup that is developing software for
self-driving vehicles. The company's product uses machine learning to predict
the trajectory of persons and vehicles. Currently, the software is being
developed using 20 vehicles, all located in the same city. IoT data is sent from
vehicles every 60 seconds to a MySQL database running on a Compute Engine
instance using an n2-standard-8 machine type with 8 vCPUs and 32 GB of
memory. The startup wants to review their architecture and make any
necessary changes to support tens of thousands of self-driving vehicles, all
transmitting IoT data every second. The vehicles will be located across North
America and Europe. Approximately 4 KB of data is sent in each transmission.
What changes to the architecture would you recommend?

A. None. The current architecture is well suited to the use case.
B. Replace Cloud SQL with Cloud Spanner.
C. Replace Cloud SQL with Bigtable.
D. Replace Cloud SQL with Cloud Datastore.
C. The correct answer is C. Bigtable is the best storage service for IoT data,
especially when a large number of devices will be sending data at short intervals.
Option A is incorrect because a single MySQL instance, like Cloud SQL, is designed
for transaction processing at a regional level. Option B is incorrect because Cloud Spanner is designed for
transaction processing, and although it scales to global levels, it is not the best
option for IoT data. Option D is incorrect because there is no need for indexed,
semi-structured data.
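The jump in write load is easier to see as arithmetic. The sketch below compares the current setup (20 vehicles, one message every 60 seconds) with the target. The figure of 10,000 vehicles is an assumption standing in for the low end of "tens of thousands", and 4 KB is taken as 4,096 bytes.

```python
BYTES_PER_MESSAGE = 4 * 1024  # "approximately 4 KB" per transmission

def load(vehicles: int, interval_s: float) -> tuple[float, float]:
    """Return (writes per second, MiB per second) for a fleet."""
    writes_per_s = vehicles / interval_s
    mib_per_s = writes_per_s * BYTES_PER_MESSAGE / (1024 * 1024)
    return writes_per_s, mib_per_s

current = load(20, 60)      # today's development fleet
target = load(10_000, 1)    # assumed low end of "tens of thousands"

print(f"current: {current[0]:.2f} writes/s")
print(f"target:  {target[0]:.0f} writes/s, {target[1]:.2f} MiB/s")
print(f"growth in write rate: {target[0] / current[0]:.0f}x")
```

Under these assumptions the write rate grows from about one write every three seconds to 10,000 writes per second, roughly a 30,000-fold increase, and every additional 10,000 vehicles adds the same amount again.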
As a member of a team of game developers, you have been tasked with devising
a way to track players' possessions. Possessions may be purchased from a
catalog, traded with other players, or awarded for game activities. Possessions
are categorized as clothing, tools, books, and coins. Players may have any
number of possessions of any type. Players can search for other players who
have particular possession types to facilitate trading. The game designer has
informed you that there will likely be new types of possessions and ways to
acquire them in the future. What kind of data store would you recommend
using?

A. Transactional database
B. Wide-column database
C. Document database
D. Analytic database
C. The correct answer is C because the requirements call for a semi-structured
schema. You will need to search players' possessions and not just look them up
using a single key because of the requirement for facilitating trading. Option A is
not correct. Transactional databases have fixed schemas, and this use case calls
for a semi-structured schema. Option B is incorrect because wide-column databases
look up data by row key and do not support the indexed lookup needed for
searching. Option D is incorrect. Analytical
databases are structured data stores.
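A small sketch can show why a semi-structured, indexed model fits this case. Plain Python dictionaries stand in for documents here; the player IDs, item fields, and the `build_type_index` helper are all hypothetical, and a real document database would maintain such an index itself.

```python
from collections import defaultdict

# Each player document holds possessions with differing fields per type.
players = {
    "p1": {"possessions": [{"type": "tools", "name": "hammer"},
                           {"type": "coins", "amount": 40}]},
    "p2": {"possessions": [{"type": "books", "name": "atlas", "pages": 300}]},
    # "pets" is a new possession type added without any schema change.
    "p3": {"possessions": [{"type": "tools", "name": "saw"},
                           {"type": "pets", "name": "owl"}]},
}

def build_type_index(docs: dict) -> dict:
    """Map each possession type to the set of player IDs that own one."""
    index = defaultdict(set)
    for player_id, doc in docs.items():
        for item in doc["possessions"]:
            index[item["type"]].add(player_id)
    return index

index = build_type_index(players)
print(sorted(index["tools"]))  # ['p1', 'p3']
```

The search for trading partners goes through the index on possession type rather than through a single key per player, which is the requirement that rules out a key-only lookup model.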
The CTO of your company wants to reduce the cost of running an HBase and
Hadoop cluster on premises. Only one HBase application is run on the cluster.
The cluster currently supports 10 TB of data, but it is expected to double in the
next six months. Which of the following managed services would you recommend
to replace the on-premises cluster in order to minimize migration and ongoing
operational costs?

A. Cloud Bigtable using the HBase API
B. Cloud Dataflow using the HBase API
C. Cloud Spanner
D. Cloud Datastore
A. The correct answer is A. Cloud Bigtable using the HBase API would minimize
migration efforts, and since Bigtable is a managed service, it would help reduce
operational costs. Option B is incorrect. Cloud Dataflow is a stream and batch
processing service, not a database. Options C and D are incorrect. Cloud Spanner
is a relational database and Cloud Datastore is a document database; neither is
likely to be an appropriate choice for an HBase database, which is a wide-column
NoSQL database, and migrating to a different data model would incur unnecessary costs.
A genomics research institute is developing a platform for analyzing data
related to genetic diseases. The genomics data is in a specialized format known

Seller: becciedgar26