IBM Machine Learning Specialist Exam
Verified Questions, Correct Answers, and
Detailed Explanations for Computer Science
Students||Already Graded A+
1. Which of the following is a supervised learning algorithm?
A) K-Means Clustering
B) Linear Regression
C) PCA
D) DBSCAN
Rationale: Linear Regression is used in supervised learning where the
model learns from labeled data to predict a continuous target
variable.
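As a minimal sketch of what "learning from labeled data" means here, the following fits a one-feature least-squares line. The function name fit_line and the toy data are illustrative assumptions, not part of the exam material.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Labeled training data: every input x comes with a known target y.
# The toy targets follow y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict a continuous value for x = 5
```

The labels (ys) are what make this supervised: the fit is driven entirely by the gap between predictions and known targets.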
2. What is overfitting in machine learning?
A) When a model performs well on test data but poorly on training
data
B) When a model performs well on training data but poorly on
unseen data
C) When a model underperforms due to insufficient training
D) When a model has too few parameters
Rationale: Overfitting occurs when a model captures noise in the
training data, making it less generalizable to new data.
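One way to see the gap between training and unseen performance is a 1-nearest-neighbor classifier, which memorizes its training set. The sketch below is a toy illustration under assumed data: the point (2.0, 1) is a deliberately mislabeled "noise" example, and all names are hypothetical.

```python
def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label


def accuracy(train, data):
    """Fraction of (x, label) pairs in data that 1-NN classifies correctly."""
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)


# Assumed true rule: label is 1 when x >= 5. The point (2.0, 1) is label noise.
train = [(1.0, 0), (2.0, 1), (3.0, 0), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
test = [(1.9, 0), (2.2, 0), (5.5, 1), (8.4, 1)]

train_acc = accuracy(train, train)  # 1.0: every point is its own nearest neighbor
test_acc = accuracy(train, test)    # 0.5: the memorized noise misleads nearby points
```

The model scores perfectly on the data it memorized, yet the noisy label drags down its accuracy on nearby unseen points.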
3. Which metric is suitable for evaluating a classification model on
imbalanced datasets?
A) Accuracy
B) F1-Score
C) Mean Squared Error
D) R²
Rationale: F1-Score balances precision and recall, making it suitable
for imbalanced datasets where accuracy might be misleading.
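To illustrate why accuracy can mislead, the sketch below scores a classifier that always predicts the majority class on an assumed 95/5 split. The helper names and the data are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Imbalanced toy data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A useless classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)  # 0.95, looks strong
f1 = f1_score(y_true, y_pred)   # 0.0, reveals that no positive was found
```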
4. What does the bias-variance tradeoff describe?
A) Tradeoff between learning rate and convergence
B) Tradeoff between model complexity and generalization error
C) Tradeoff between training and testing time
D) Tradeoff between precision and recall
Rationale: A high-bias model underfits, while a high-variance model
overfits; the tradeoff aims to minimize generalization error.
5. Which of the following is a technique to reduce overfitting?
A) Adding more features
B) Regularization
C) Reducing dataset size
D) Increasing model complexity
Rationale: Regularization techniques like L1/L2 penalize large
coefficients, discouraging the model from fitting noise.
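The effect of an L2 penalty can be sketched with a one-feature model without an intercept, where minimizing sum((y - w*x)^2) + lam * w^2 has the closed form w = sum(x*y) / (sum(x^2) + lam). The function name, the data, and the penalty value are assumptions chosen for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form L2-regularized slope for y ~ w * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_plain = ridge_slope(xs, ys, lam=0.0)   # 28 / 14 = 2.0, no penalty
w_ridge = ridge_slope(xs, ys, lam=14.0)  # 28 / 28 = 1.0, coefficient shrunk
```

A larger lam pulls the coefficient toward zero, which limits how strongly the model can react to any single feature.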
6. In a neural network, which activation function is commonly used
for binary classification?
A) ReLU
B) Tanh
C) Sigmoid
D) Softmax
Rationale: Sigmoid outputs values between 0 and 1, suitable for
modeling probabilities in binary classification.
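For reference, a minimal sketch of the sigmoid function follows. The two-branch form is one common way to avoid overflow for large negative inputs; the 0.5 decision threshold is an illustrative assumption.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # safe: z is negative, so exp(z) cannot overflow
    return ez / (1.0 + ez)


p = sigmoid(0.0)              # 0.5: the model is undecided
label = 1 if p >= 0.5 else 0  # assumed threshold for turning p into a class
```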
7. Which unsupervised learning algorithm is used for
dimensionality reduction?
A) K-Nearest Neighbors
B) Principal Component Analysis (PCA)
C) Decision Trees
D) Linear Regression
Rationale: PCA reduces the number of features while retaining most
of the variance in the data.
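A rough sketch of the idea: center the data, build the covariance matrix, and find its dominant direction. Power iteration is used here only as a simple stand-in for a full eigendecomposition; it assumes the starting vector is not orthogonal to the top component and that the data has nonzero variance. All names and data are illustrative.

```python
import math


def top_principal_component(points, iterations=100):
    """Return (feature means, unit vector of the direction of largest variance)."""
    n, dim = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - means[j] for j in range(dim)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    v = [1.0] + [0.0] * (dim - 1)  # assumed starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(points, means, component):
    """Reduce each point to a single coordinate along the component."""
    return [sum((p[j] - means[j]) * component[j] for j in range(len(means)))
            for p in points]


# Two perfectly correlated features: one dimension carries all the variance.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
means, component = top_principal_component(points)
reduced = project(points, means, component)  # 2 features reduced to 1
```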
8. What is a confusion matrix?
A) A way to visualize model accuracy only
B) A table showing true vs predicted labels for classification models
C) A method for dimensionality reduction
D) A clustering evaluation metric
Rationale: Confusion matrices provide insight into true positives, false
positives, true negatives, and false negatives.
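The four cells can be counted directly from paired true and predicted labels. The following is a small sketch for the binary case; the function name and the labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN and FN for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
# counts == {"TP": 3, "FP": 1, "TN": 3, "FN": 1}
```

Metrics such as precision (TP / (TP + FP)) and recall (TP / (TP + FN)) are computed from these four counts.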
9. Which technique can handle missing data in a dataset?
A) Ignore the missing data
B) Imputation (mean, median, mode, or predictive)
C) Randomly fill with zeros
D) None of the above
Rationale: Imputation replaces missing values with statistical
estimates (such as the mean, median, or mode) or model-predicted
values, so records with gaps do not have to be discarded.
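A minimal sketch of statistical imputation follows, assuming missing entries are represented as None. The function name impute and the sample values are illustrative.

```python
from statistics import mean, median


def impute(values, strategy=mean):
    """Replace None entries with a statistic of the observed values."""
    observed = [v for v in values if v is not None]
    fill = strategy(observed)
    return [fill if v is None else v for v in values]


incomes = [10, None, 20, 90]

mean_filled = impute(incomes)            # [10, 40, 20, 90]
median_filled = impute(incomes, median)  # [10, 20, 20, 90]
```

The median is less sensitive to the outlier (90) than the mean, which is one reason the choice of statistic matters.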
10. What is the main difference between bagging and boosting?
A) Bagging is sequential, boosting is parallel
B) Bagging is parallel, boosting is sequential
C) Both are unsupervised methods
D) Bagging increases bias, boosting increases variance
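Rationale: Bagging reduces variance by training multiple models
independently (in parallel) on bootstrap samples and aggregating their
predictions; boosting reduces bias by training models sequentially,
with each model correcting the errors of its predecessors.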
11. Which evaluation metric is used for regression problems?
A) F1-Score
B) Accuracy
C) Mean Squared Error (MSE)
D) Precision
Rationale: MSE measures the average squared difference between
predicted and actual values, suitable for regression tasks.
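The definition translates directly into code. This is a small sketch with made-up values:

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

mse = mean_squared_error(y_true, y_pred)  # (1 + 0 + 4) / 3
```

Because the errors are squared, the single miss of 2 contributes four times as much as the miss of 1.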
12. What does the ROC curve represent?
A) Loss vs Epoch
B) Tradeoff between True Positive Rate and False Positive Rate
C) Accuracy vs Precision
D) Model complexity vs error
Rationale: The ROC curve plots the True Positive Rate against the
False Positive Rate of a binary classifier as the decision threshold
varies.
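The points of a ROC curve can be sketched by sweeping the threshold over the predicted scores and recording the two rates at each step. The scores and labels below are invented, and the function assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
# The curve starts at (0.0, 0.0) and ends at (1.0, 1.0).
```

Lowering the threshold can only add predicted positives, so both rates are non-decreasing along the curve.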
13. Which of the following is a kernel function used in SVM?
A) Linear
B) Polynomial