CSC216: INTRODUCTION TO DATA SCIENCE
1. Data Science can be defined as:
A. The study of computer hardware and software systems.
B. The process of extracting insights from structured and unstructured data using statistics,
programming, and domain knowledge.
C. A field focused only on data storage and database management.
D. The development of artificial intelligence without the use of data.
ANSWER: B
2. Which of the following best describes the relationship between Data Science, Machine
Learning, and AI?
A. Data Science is a subset of AI, and Machine Learning is unrelated.
B. Machine Learning is a subset of AI, and Data Science uses Machine Learning for predictive
analytics.
C. AI, Machine Learning, and Data Science are entirely separate fields with no overlap.
D. Data Science and AI are the same, while Machine Learning is a different discipline.
ANSWER: B
3. Which of the following is NOT a typical stage in the Data Science process?
A. Data Collection
B. Model Deployment
C. Hardware Manufacturing
D. Data Cleaning
ANSWER: C
4. Data Science is applied in which of the following domains?
A. Healthcare, Finance, Marketing
B. Only Software Engineering
C. Only Academic Research
D. Only Government Agencies
ANSWER: A
5. Which skill set is most essential for a Data Scientist?
A. Proficiency in electrical engineering and circuit design.
B. Knowledge of statistics, programming, and domain expertise.
C. Expertise in graphic design and animation.
D. Mastery of mechanical engineering principles.
ANSWER: B
6. Which of the following is NOT commonly used as a data analysis tool in Data Science?
A. R
B. Python
C. SAS
D. AutoCAD
ANSWER: D
7. Which Python library is primarily used for numerical operations and array processing?
A. Pandas
B. Matplotlib
C. NumPy
D. Scikit-learn
ANSWER: C
8. For data visualization, which tool is widely used for creating interactive dashboards?
A. Tableau
B. Hadoop
C. SQL Server
D. Apache Spark
ANSWER: A
9. Which of the following is a machine learning framework commonly used in Data
Science?
A. TensorFlow
B. Microsoft Word
C. Adobe Photoshop
D. Oracle Database
ANSWER: A
10. Which of the following cloud platforms is frequently used for big data processing and
machine learning services?
A. AWS
B. Photoshop Express
C. Microsoft Paint
D. Google Drive (basic)
ANSWER: A
11. In the context of data science toolkits, which of the following pairs correctly matches a
tool with its primary function?
A. KNIME – Data visualization only
B. RapidMiner – Exclusive to statistical hypothesis testing
C. Apache Spark – Distributed data processing and machine learning
D. Tableau Public – Backend database management
ANSWER: C
12. Which combination of Python libraries is essential for a complete data science workflow
covering data manipulation, visualization, and machine learning?
A. Pandas, Seaborn, Scikit-learn
B. NumPy, TensorFlow, PyTorch
C. Matplotlib, SQLAlchemy, Keras
D. OpenCV, NLTK, SpaCy
ANSWER: A
13. When selecting a big data tool for real-time stream processing, which of the following is
most appropriate?
A. Hadoop MapReduce