AWS Machine Learning Associate: Hands On!
1.Your company needs to store a massive volume of semi-structured data for future
analysis. Which AWS storage solution would be the most cost-effective and flexible?:
Amazon S3
2.Your application needs to provide low-latency access to frequently accessed
data while minimizing costs. Which S3 storage class should you choose?: S3
Standard
3.Your company uses AWS Lambda to generate thumbnails from images uploaded
to S3. The thumbnails are only needed temporarily. Which storage class should you
choose for these thumbnails?: S3 One Zone-Infrequent Access
4.A large-scale real-time data processing application needs to analyze data from
IoT devices. Which service should you use to ingest, process, and analyze this data
in near real-time?: Amazon Kinesis Data Streams
5.Your company is implementing a data lake architecture on AWS. They need to
enforce governance and control access to specific data products across multiple
teams. Which approach should they use?: Data Mesh
6.A company needs to encrypt sensitive data at rest stored in Amazon S3. They also
require auditability of key usage. Which encryption option should they choose?:
SSE-KMS
7.Your organization needs to choose a data format for a large-scale ETL pipeline
that processes batch data in Hadoop and Spark. The data should be optimized for
analytics and efficient storage. Which format is most appropriate?: Parquet
8.Your application requires high-performance block storage for an EC2 instance
that needs to persist data after termination. Which AWS storage solution should
you choose?: Amazon EBS
9.A team is working on a shared development environment and needs a scalable,
highly available file system that can be mounted on multiple EC2 instances across
multiple Availability Zones. Which AWS service should they use?: Amazon EFS
10.During peak usage, your Kinesis Data Streams application is experiencing
throttling errors. Which of the following actions should you take to resolve this issue?:
Increase the number of shards
11.Your data engineering team is setting up a data pipeline to process large datasets
using Hadoop and Spark on AWS. Which service would you choose to efficiently
manage the cluster?: Amazon EMR
12.Your team is building a recommendation system for an e-commerce platform.
Which SageMaker feature would help you efficiently handle and scale the training
of your machine learning model?: SageMaker Training Jobs
13.A financial institution wants to detect anomalies in transaction data to identify
potential fraud. Which AWS service could you use to deploy an outlier detection
model, and monitor its performance over time?: SageMaker Model Monitor
14.Your company is developing a machine learning model to predict customer churn,
but you're concerned about potential bias in the model's predictions. Which
SageMaker feature helps detect and mitigate bias during model training?:
SageMaker Clarify
15.You are tasked with deploying a machine learning model to edge devices in a
factory. Which SageMaker feature would you use to optimize and deploy the model
efficiently?: SageMaker Neo
16.You are managing a big data project that requires periodic processing of large
datasets stored in Amazon S3. Which service would you use to schedule and manage
the execution of these data processing jobs?: AWS Glue
17.A healthcare company needs to label large amounts of medical images for
training a diagnostic AI model. Which SageMaker service would streamline this
process by managing the labeling tasks?: SageMaker Ground Truth
18.A data scientist wants to experiment with different machine learning algorithms
and compare their performance on a dataset. Which SageMaker feature provides an
environment for such experimentation?: SageMaker Studio
19.You are working with a dataset that contains several missing values in a
column representing customer ages. The data is relatively small, and the
distribution of ages is skewed by a few extreme outliers. Which imputation
technique should you choose to fill in the missing values?: Median Replacement
20.Your machine learning model is struggling to accurately predict a rare event,
such as equipment failure, which occurs in less than 1% of your dataset. You want to
generate synthetic samples to balance the dataset without simply duplicating existing
data. Which technique should you use?: SMOTE (Synthetic Minority Oversampling
Technique)
21.A media company wants to automatically generate subtitles for videos in
real-time. Which AWS service is best suited for this task?: Amazon Transcribe
22.Your company needs to perform sentiment analysis on customer feedback
provided in various languages. Which AWS service would best suit this
requirement?: Amazon Comprehend
23.A financial institution needs to detect fraudulent transactions in real-time using
historical transaction data. Which AWS service is most appropriate for this use case,
while minimizing development effort?: Amazon Fraud Detector
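
Worked example for cards 19 and 20 (median replacement and SMOTE). This is a minimal sketch using only the Python standard library, not a production implementation. The function names impute_median and smote_like, the sample ages, and the toy minority points are all made up for illustration. The SMOTE part shows only the core idea (interpolating between a minority sample and one of its nearest minority neighbours) and assumes at least two minority samples.

```python
import math
import random
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical ages with two missing values and one extreme outlier (95).
ages = [23, 25, None, 31, 29, None, 95]
filled = impute_median(ages)  # missing entries become 29, the median
mean_age = statistics.mean(v for v in ages if v is not None)  # 40.6

# Hypothetical minority-class points (two features per sample).
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4)]
synthetic = smote_like(minority, n_new=5)
```

In this toy data the outlier pulls the mean of the observed ages up to 40.6 while the median stays at 29, which is why the median is the safer fill value for skewed data. The synthetic points are new values between existing minority samples rather than copies of them. For real workloads a maintained implementation, such as the SMOTE class in the imbalanced-learn package, is likely a better starting point than this sketch.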