GCP Professional Data Engineer Certification Exam
2025/2026: Complete Questions and Correct Answers
1. You are developing an application that will only
recognize and tag specific business to business product
logos in images. You do not have an extensive
background working with machine learning models, but
need to get your application working. What is the
current best method to accomplish this task? -
...ANSWER...✔✔a. Use the AutoML Vision service to
train a custom model using the Vision API
i. The AutoML services let you train custom image models
(and other model types) using Google's pre-trained APIs
as a base. Training a custom model also works on AI
Platform, but the AutoML route requires less manual
model-building overhead.
2. Your organization is streaming telemetry data into
BigQuery for long-term storage (2 years) and analysis,
at the rate of about 100 million records per day. They
need to be able to run queries against certain time
periods of data without incurring the costs of querying
all available records. What is the preferred method for
doing so? - ...ANSWER...✔✔a. Partition a single table
by day, and run queries against individual partitions.
i. Partitioning by date lets you maintain a single table
while running queries against only a portion of it. A
query that filters on the partitioning column scans only
the matching partitions, so you are billed only for that
data. While using wildcards across multiple tables (one
for each day) technically works, partitioning a single
table is best practice.
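To put rough numbers on the saving, here is a minimal sketch using the figures from the question (100 million records per day, kept for 2 years). The function names and the table name `my-project.telemetry.events` are hypothetical, and the query builder assumes an ingestion-time partitioned table, which exposes the `_PARTITIONDATE` pseudo-column.

```python
from datetime import date

RECORDS_PER_DAY = 100_000_000   # ingest rate from the question
RETENTION_DAYS = 2 * 365        # two years of daily partitions


def records_scanned(days_queried: int, partitioned: bool) -> int:
    """Rough count of records a query has to read."""
    total = RECORDS_PER_DAY * RETENTION_DAYS
    if not partitioned:
        return total  # no pruning: every stored record is scanned
    # With pruning, only partitions inside the date filter are read.
    return RECORDS_PER_DAY * min(days_queried, RETENTION_DAYS)


def partition_query(table: str, start: date, end: date) -> str:
    """Build a query limited to a range of daily partitions.

    Assumption: the table is ingestion-time partitioned, so the
    _PARTITIONDATE pseudo-column is available for filtering.
    """
    return (
        f"SELECT * FROM `{table}` "
        f"WHERE _PARTITIONDATE BETWEEN DATE '{start.isoformat()}' "
        f"AND DATE '{end.isoformat()}'"
    )


if __name__ == "__main__":
    print(records_scanned(1, partitioned=False))  # 73000000000
    print(records_scanned(1, partitioned=True))   # 100000000
    # Hypothetical table name, for illustration only.
    print(partition_query("my-project.telemetry.events",
                          date(2025, 1, 1), date(2025, 1, 7)))
```

Under these assumptions, a one-day query reads about 1/730 of what a full scan would read.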
3. You are an administrator for several organizations in
the same company. Each organization has data in their
own BigQuery table within a single project. For
application access reasons, all of the tables must
remain in the same project. You think each organization
should be able to view and run queries against their
own data without exposing it to unauthorized viewers from
other organizations. What should you recommend? -
...ANSWER...✔✔a. Create a separate dataset for each
organization in the same project. Place each
organization's table in its own dataset. Restrict access
to each dataset to that organization only, so its users
can view their own table but no one else's.
i. You can assign roles at the dataset level. Placing
tables in different datasets allows you to limit access
per dataset.
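The one-dataset-per-organization layout can be sketched as plain data. The entries below mirror the shape of the `access` list on a BigQuery dataset resource (`role`, `groupByEmail`, `specialGroup`). The organization names, group emails, and the `_data` dataset naming are hypothetical, and nothing here calls the BigQuery API.

```python
def build_access_plan(org_groups: dict) -> dict:
    """Map each organization to its own dataset and access entries.

    org_groups maps an organization name to the (hypothetical)
    Google group that holds that organization's users.
    """
    plan = {}
    for org, group_email in org_groups.items():
        dataset_id = f"{org}_data"  # assumed naming scheme
        plan[dataset_id] = [
            # Project owners keep control of every dataset.
            {"role": "OWNER", "specialGroup": "projectOwners"},
            # Only this organization's group may read this dataset.
            {"role": "READER", "groupByEmail": group_email},
        ]
    return plan


def reader_groups(plan: dict, dataset_id: str) -> set:
    """Return the groups granted READER on a dataset."""
    return {
        entry["groupByEmail"]
        for entry in plan[dataset_id]
        if entry["role"] == "READER" and "groupByEmail" in entry
    }


if __name__ == "__main__":
    plan = build_access_plan({
        "org_a": "org-a-analysts@example.com",
        "org_b": "org-b-analysts@example.com",
    })
    print(reader_groups(plan, "org_a_data"))
```

Because each reader entry is attached to a single dataset, one organization's group never appears on another organization's dataset.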
4. Your company is making the move to Google Cloud
and has chosen to use a managed database service to
reduce overhead. Your existing database is used for a
product catalog that provides real-time inventory
tracking for a retailer. Your database is 500 GB in size.
The data is semi-structured and does not need full
atomicity. You are looking for a truly no-ops/serverless
solution. What storage option should you choose? -
...ANSWER...✔✔a. Cloud Datastore
i. Datastore is a good fit for semi-structured data
under 1 TB in size, and it is fully managed with no
servers to provision. Product catalogs are a
recommended use case.
5. How can you set up your Dataproc environment to
use BigQuery as an input and output source? -
...ANSWER...✔✔a. Install the BigQuery connector on
your Dataproc cluster.
i. Installing the BigQuery connector on your cluster
gives you direct programmatic read/write access to
BigQuery. Note that a Cloud Storage bucket is used as a
staging area between the two services, but your code
interacts directly with BigQuery from Dataproc.
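As an illustration of where the Cloud Storage bucket comes in, here is a minimal sketch of the configuration a PySpark job might hand to the Hadoop BigQuery connector. The `mapred.bq.*` property names are given as I understand them and should be checked against the connector version on your cluster. The function name, the temp path layout, and all argument values are hypothetical.

```python
def bigquery_input_conf(project: str, bucket: str,
                        dataset: str, table: str) -> dict:
    """Build connector settings for reading one BigQuery table.

    Assumption: the cluster has the Hadoop BigQuery connector
    installed and the job may write to the staging bucket.
    """
    return {
        # Project that owns the job and the staging bucket.
        "mapred.bq.project.id": project,
        "mapred.bq.gcs.bucket": bucket,
        # Staging location: table data is exported here first.
        "mapred.bq.temp.gcs.path":
            f"gs://{bucket}/hadoop/tmp/bigquery/pyspark_input",
        # The table to read.
        "mapred.bq.input.project.id": project,
        "mapred.bq.input.dataset.id": dataset,
        "mapred.bq.input.table.id": table,
    }


if __name__ == "__main__":
    conf = bigquery_input_conf(
        "my-project", "my-staging-bucket", "telemetry", "events"
    )
    for key, value in conf.items():
        print(key, "=", value)
```

In a real job, a dictionary like this would typically be passed as the `conf` argument of `SparkContext.newAPIHadoopRDD`. The job names only the BigQuery table, while the bucket serves purely as intermediate storage.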
6. In AI Platform, what does the CUSTOM tier allow you
to configure? Choose the best answer. -
...ANSWER...✔✔a. Custom number of workers and
parameter servers. Machine type of master server
i. You can set the number of workers and parameter
servers and choose the master's machine type, but the
number of masters is fixed at one.
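A minimal sketch of what a CUSTOM tier request might contain, written as a Python dictionary shaped like the `trainingInput` of an AI Platform training job. The helper name is hypothetical and the machine type values are illustrative only. There is deliberately no master count field, since a job always has exactly one master.

```python
def custom_training_input(master_type: str,
                          worker_type: str, worker_count: int,
                          ps_type: str, ps_count: int) -> dict:
    """Build a trainingInput-style dict for the CUSTOM scale tier."""
    if worker_count < 0 or ps_count < 0:
        raise ValueError("counts must be non-negative")
    return {
        "scaleTier": "CUSTOM",
        # Configurable: the machine type of the single master.
        "masterType": master_type,
        # Configurable: worker machine type and count.
        "workerType": worker_type,
        "workerCount": worker_count,
        # Configurable: parameter server machine type and count.
        "parameterServerType": ps_type,
        "parameterServerCount": ps_count,
    }


if __name__ == "__main__":
    # Illustrative values, not a sizing recommendation.
    spec = custom_training_input(
        master_type="complex_model_m",
        worker_type="complex_model_m", worker_count=9,
        ps_type="large_model", ps_count=3,
    )
    print(spec)
```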