Exam (worked answers)

GCP Professional Data Engineer Exam Questions With All Correct Answers

Grade: A+
Uploaded on: 29-08-2025
Written in: 2025/2026


Institution: GCP Professional Data Engineer
Course: GCP Professional Data Engineer

Preview of the content

GCP Professional Data Engineer Exam Questions With All Correct Answers

1. You are developing an application that will recognize and tag only specific business-to-business product logos in images. You do not have an extensive background working with machine learning models, but you need to get your application working. What is the current best method to accomplish this task?
Answer: a. Use the AutoML Vision service to train a custom model using the Vision API.
i. The AutoML services let you train custom image models (and other model types) using Google's pre-trained APIs as a base. Training a custom model also works on AI Platform, but the AutoML route requires less manual model overhead.

2. Your organization is streaming telemetry data into BigQuery for long-term storage (2 years) and analysis, at a rate of about 100 million records per day. They need to be able to run queries against certain time periods of data without incurring the cost of querying all available records. What is the preferred method for doing so?
Answer: a. Partition a single table by day, and run queries against individual partitions.
i. Partitioning a single table by date lets you maintain one table while running queries against only a portion of it, so a query scans only the partitions it names. Using wildcards across multiple tables (one for each day) technically works, but partitioning a single table is best practice.
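
As a minimal sketch of what "run queries against individual partitions" can look like, the helper below builds a query string that filters on `_PARTITIONTIME`, the pseudo-column BigQuery exposes on ingestion-time partitioned tables. It assumes the telemetry table is partitioned by ingestion day; the function name and the table name used in the usage line are hypothetical.

```python
from datetime import date, timedelta


def partition_query(table: str, first_day: date, last_day: date) -> str:
    """Build a query limited to the daily partitions first_day..last_day (inclusive).

    Assumes `table` is an ingestion-time partitioned table, so the
    _PARTITIONTIME pseudo-column is available for filtering.
    """
    # Use a half-open range: [first_day, last_day + 1 day)
    end_exclusive = last_day + timedelta(days=1)
    return (
        f"SELECT * FROM `{table}` "
        f"WHERE _PARTITIONTIME >= TIMESTAMP('{first_day.isoformat()}') "
        f"AND _PARTITIONTIME < TIMESTAMP('{end_exclusive.isoformat()}')"
    )


# Hypothetical table name, for illustration only.
sql = partition_query("my-project.telemetry.events", date(2025, 1, 1), date(2025, 1, 7))
```

Because the filter is on the partitioning column, only the seven named daily partitions need to be read rather than the full two years of data.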

3. You are an administrator for several organizations in the same company. Each organization has data in its own BigQuery table within a single project. For application access reasons, all of the tables must remain in the same project. You think each organization should be able to view and run queries against its own data without exposing the data of other organizations to unauthorized viewers. What should you recommend?
Answer: a. Create a separate dataset for each organization in the same project. Place each organization's table in its own dataset. Restrict access to each dataset to only that organization, so that its members can view their table but no one else's.
i. You can assign roles at the dataset level. Placing tables in different datasets allows you to limit access per dataset.
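
The per-dataset layout can be sketched as plain data. The helper below builds one dataset definition per organization, shaped like the BigQuery dataset resource (`datasetReference` plus an `access` list of role entries). This is only an illustration under assumptions: the function name, the `_data` naming scheme, and all email addresses are hypothetical, and nothing here calls the BigQuery API.

```python
def org_dataset(project_id: str, org_name: str, org_group_email: str, admin_email: str) -> dict:
    """Return a dataset definition readable only by one organization's group."""
    return {
        "datasetReference": {
            "projectId": project_id,
            "datasetId": f"{org_name}_data",  # hypothetical naming scheme
        },
        "access": [
            # The administrator keeps ownership of every dataset.
            {"role": "OWNER", "userByEmail": admin_email},
            # Only this organization's group can read tables in this dataset.
            {"role": "READER", "groupByEmail": org_group_email},
        ],
    }


# Hypothetical organizations and groups, all in the same project.
datasets = [
    org_dataset("shared-project", "org_a", "org-a@example.com", "admin@example.com"),
    org_dataset("shared-project", "org_b", "org-b@example.com", "admin@example.com"),
]
```

Each organization's group appears in exactly one dataset's access list, which is what keeps one organization from seeing another's table while everything stays in one project.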

4. Your company is moving to Google Cloud and has chosen to use a managed database service to reduce overhead. Your existing database is used for a product catalog that provides real-time inventory tracking for a retailer. The database is 500 GB in size. The data is semi-structured and does not need full atomicity. You are looking for a truly no-ops/serverless solution. What storage option should you choose?
Answer: a. Cloud Datastore
i. Datastore is well suited to semi-structured data less than 1 TB in size. Product catalogs are a recommended use case.

5. How can you set up your Dataproc environment to use BigQuery as an input and output source?
Answer: a. Install the BigQuery connector on your Dataproc cluster.
i. You can install the BigQuery connector on your cluster for direct programmatic read/write access to BigQuery. Note that a Cloud Storage bucket is used between the two services, but you'll interact directly with BigQuery from Dataproc.
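
A rough sketch of what reading from and writing to BigQuery from a Dataproc job can look like is shown below. It assumes the Spark BigQuery connector is the connector installed on the cluster and that the job runs under PySpark; the `bigquery` format name and the `table` and `temporaryGcsBucket` options belong to that connector. The function name, table names, and bucket name are hypothetical, and the Spark session is passed in rather than created here.

```python
def copy_bigquery_table(spark, source_table: str, dest_table: str, temp_bucket: str):
    """Read one BigQuery table and append its rows to another.

    `spark` is an existing SparkSession on a cluster where the
    Spark BigQuery connector is available (an assumption of this sketch).
    """
    # Read: the connector exposes BigQuery as a Spark data source.
    df = spark.read.format("bigquery").option("table", source_table).load()

    # Write: rows are staged in a Cloud Storage bucket before being
    # loaded into BigQuery, which is the intermediate bucket the
    # explanation above refers to.
    (
        df.write.format("bigquery")
        .option("table", dest_table)
        .option("temporaryGcsBucket", temp_bucket)
        .mode("append")
        .save()
    )
    return df
```

The job code names only BigQuery tables; the bucket shows up solely as a staging option on the write.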

6. In AI Platform, what does the CUSTOM tier allow you to configure? Choose the best answer.
Answer: a. A custom number of workers and parameter servers, and the machine type of the master server.
i. You can customize the number of workers and parameter servers, but the number of masters is fixed at one.
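
To make the configurable pieces concrete, here is a small sketch that assembles the training-input section of an AI Platform training job for the CUSTOM tier. The field names (`scaleTier`, `masterType`, `workerType`, `workerCount`, `parameterServerType`, `parameterServerCount`) follow the AI Platform Training job specification; the helper name and the machine types in the usage line are illustrative assumptions.

```python
def custom_training_input(
    master_type: str,
    worker_type: str,
    worker_count: int,
    parameter_server_type: str,
    parameter_server_count: int,
) -> dict:
    """Build a CUSTOM-tier training input.

    There is no master count field: a job always has exactly one master,
    and only its machine type can be chosen.
    """
    return {
        "scaleTier": "CUSTOM",
        "masterType": master_type,
        "workerType": worker_type,
        "workerCount": worker_count,
        "parameterServerType": parameter_server_type,
        "parameterServerCount": parameter_server_count,
    }


# Illustrative values only.
training_input = custom_training_input("n1-standard-8", "n1-standard-4", 4, "n1-standard-4", 2)
```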

7. You are building a data pipeline on Google Cloud. You need to prepare source data for a machine-learning model. This involves quickly deduplicating rows from three input tables and also removing outliers from data columns where you do not know the data distribution. What should you do?
Answer: a. Use Cloud Dataprep to preview the data distributions in sample source data table columns. Click on each column name, click on each appropriate suggested transformation, and then click Add to add each transformation to the Cloud Dataprep job.
i. Dataprep is the correct choice because of the requirement to prepare and clean source data. For deduplication, using the suggested transformation is easier and quicker than writing a recipe by hand, which is more work than needed.
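
Dataprep applies these transformations through its interface, so no code is required. Purely to illustrate the two operations involved, the sketch below deduplicates rows and drops outliers in plain Python. The interquartile-range rule used here is an assumption chosen because it does not require knowing the distribution in advance; it is not a description of how Dataprep detects outliers. All names and sample rows are hypothetical.

```python
import statistics


def deduplicate(rows):
    """Drop exact duplicate rows, keeping the first occurrence of each."""
    return list(dict.fromkeys(rows))  # rows must be hashable, e.g. tuples


def remove_outliers_iqr(rows, column, k=1.5):
    """Keep rows whose value in `column` lies within k * IQR of the quartiles."""
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if low <= row[column] <= high]


# Three hypothetical input tables with one duplicate row and one extreme value.
table_1 = [("a", 10), ("b", 12)]
table_2 = [("b", 12), ("c", 11)]
table_3 = [("d", 13), ("e", 12), ("f", 11), ("g", 500)]

unique_rows = deduplicate(table_1 + table_2 + table_3)
clean_rows = remove_outliers_iqr(unique_rows, column=1)
```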

8. As part of your backup plan, you create regular boot-disk snapshots of Compute Engine instances that are running. You want to be able to restore these snapshots using the fewest possible steps for replacement instances. What should you do?
Answer: a. Use the snapshots to create replacement instances as needed.
i. Snapshots let you recreate instances in the fewest steps.

9. You are setting up multiple MySQL databases on Compute Engine. You need to collect logs from your MySQL applications for audit purposes. How should you approach this?
Answer: a. Install the Stackdriver Logging agent on your database instances and configure the fluentd plugin to read and export your MySQL logs into Stackdriver Logging.
i. The Stackdriver Logging agent requires the fluentd plugin to be configured to read logs from your database application.

10. Which of these statements do not apply to preemptible worker nodes on Cloud Dataproc? Choose two answers.
Answer: a. You must have a maximum 2:1 ratio of preemptible to standard workers.
i. There is no ratio requirement, but be aware that preemptible workers can be reclaimed at any time, so you will want a number of standard workers that are always persistent.
b. Your cluster can be created with only preemptible workers.
i. You must have at least one standard worker in a cluster.

Document information

Uploaded on: 29 August 2025
Number of pages: 11
Written in: 2025/2026
Type: Exam (worked answers)
Contains: Questions and answers

Topics

gcp professional data engineer
gcp professional data engineer exam questions
questions with all correct answers

Seller: kartelodoc
