Complete Course Notes, Data Analytics Concepts, Practice
Questions & Verified Answers for Western Governors
University D491 | Data Analysis Fundamentals, Business
Intelligence & Analytical Decision-Making
Question 1: Which type of analytics focuses on summarizing historical data to understand
what has happened in a business context?
A. Predictive analytics
B. Prescriptive analytics
C. Descriptive analytics
D. Diagnostic analytics
CORRECT ANSWER: C. Descriptive analytics
Rationale: Descriptive analytics involves the interpretation of historical data to identify
patterns, trends, and insights about past performance. It answers the question "What
happened?" through techniques like reporting, dashboards, and data aggregation, forming the
foundation for more advanced analytical approaches.
Question 2: In the analytics lifecycle, which phase involves defining the business problem and
establishing measurable objectives?
A. Data collection
B. Problem framing
C. Model deployment
D. Results communication
CORRECT ANSWER: B. Problem framing
Rationale: Problem framing is the critical initial phase where analysts collaborate with
stakeholders to clearly articulate the business question, define success metrics, and establish
scope. This ensures that subsequent analytical efforts remain aligned with organizational goals
and deliver actionable insights.
Question 3: Which data type represents categories without any intrinsic ordering, such as
product colors or customer regions?
A. Ordinal data
B. Interval data
C. Nominal data
D. Ratio data
CORRECT ANSWER: C. Nominal data
Rationale: Nominal data consists of categories or labels with no meaningful order or ranking.
Examples include gender, country of origin, or product SKU codes. Statistical analyses for
nominal data typically involve frequency counts, the mode, and chi-square tests rather than
arithmetic operations such as averaging.
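As a minimal sketch of the frequency-count approach, assuming a hypothetical list of product colors, Python's collections.Counter tallies each category and reports the mode. The values are illustrative only.

```python
from collections import Counter

# Hypothetical nominal data: product colors have no inherent order.
colors = ["red", "blue", "red", "green", "blue", "red"]

# Frequency counts are the basic summary for nominal data.
counts = Counter(colors)

# The mode is the most frequent category; a mean of colors is undefined.
mode_color, mode_count = counts.most_common(1)[0]

print(dict(counts))
print(mode_color, mode_count)
```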
Question 4: What is the primary purpose of data governance in an analytics initiative?
A. To increase data storage capacity
B. To ensure data quality, security, and compliance throughout its lifecycle
C. To accelerate data processing speeds
D. To reduce the cost of analytics software licenses
CORRECT ANSWER: B. To ensure data quality, security, and compliance throughout its
lifecycle
Rationale: Data governance establishes policies, standards, and procedures to manage data
assets responsibly. It addresses data accuracy, accessibility, privacy regulations (like GDPR), and
ethical use, which are essential for building trust in analytical outputs and maintaining
regulatory compliance.
Question 5: Which measure of central tendency is most appropriate when analyzing a dataset
with significant outliers?
A. Mean
B. Median
C. Mode
D. Standard deviation
CORRECT ANSWER: B. Median
Rationale: The median represents the middle value when data is ordered, making it resistant to
extreme values. Unlike the mean, which can be skewed by outliers, the median provides a more
robust measure of central tendency for skewed distributions or datasets with anomalous
observations.
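The effect of an outlier on each measure can be seen in a small sketch, assuming a hypothetical set of order values in which the last value is extreme:

```python
from statistics import mean, median

# Hypothetical order values; the final value is an extreme outlier.
order_values = [20, 22, 25, 27, 30, 1000]

avg = mean(order_values)    # pulled far upward by the single outlier
mid = median(order_values)  # average of the two middle values, 25 and 27

print(avg)
print(mid)
```

Here the median stays near the typical order value, while the mean lands far above five of the six observations.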
Question 6: In data visualization, which principle emphasizes minimizing non-essential
graphical elements to improve clarity?
A. Chart junk reduction
B. Data-ink ratio optimization
C. Color theory application
D. Interactive element integration
CORRECT ANSWER: B. Data-ink ratio optimization
Rationale: Coined by Edward Tufte, the data-ink ratio principle advocates maximizing the
proportion of ink used to display actual data versus decorative or redundant elements. This
enhances viewer comprehension by reducing cognitive load and focusing attention on
meaningful information.
Question 7: Which SQL clause is used to filter records after a GROUP BY operation has been
applied?
A. WHERE
B. HAVING
C. FILTER
D. LIMIT
CORRECT ANSWER: B. HAVING
Rationale: The HAVING clause filters aggregated results after GROUP BY, whereas WHERE filters
individual rows before aggregation. For example, HAVING COUNT(*) > 10 selects groups with
more than 10 records, which cannot be accomplished with WHERE alone.
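The difference between the two clauses can be sketched with Python's built-in sqlite3 module. The orders table, its columns, and its rows are hypothetical, and the thresholds are chosen only to keep the example small.

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("East", 50.0), ("East", 70.0), ("East", 20.0),
        ("West", 500.0), ("West", 10.0),
        ("North", 30.0),
    ],
)

query = """
SELECT region, COUNT(*) AS n
FROM orders
WHERE amount >= 20      -- row-level filter, applied before grouping
GROUP BY region
HAVING COUNT(*) > 1     -- group-level filter, applied after aggregation
ORDER BY region
"""
result = conn.execute(query).fetchall()
print(result)
conn.close()
```

WHERE removes the 10.0 West order before grouping, so West and North each contain one row and are dropped by HAVING; only East remains.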
Question 8: What is the primary ethical concern when using customer data for predictive
analytics without explicit consent?
A. Increased computational costs
B. Violation of privacy and autonomy
C. Reduced model accuracy
D. Longer processing times
CORRECT ANSWER: B. Violation of privacy and autonomy
Rationale: Ethical analytics requires respecting individuals' rights to control their personal
information. Using data without informed consent can erode trust, violate regulations like CCPA
or GDPR, and potentially cause harm through discriminatory outcomes or unauthorized
profiling.
Question 9: Which sampling method ensures every member of a population has an equal
probability of being selected?
A. Convenience sampling
B. Stratified sampling
C. Simple random sampling
D. Quota sampling
CORRECT ANSWER: C. Simple random sampling
Rationale: Simple random sampling gives each population element an identical chance of
inclusion, minimizing selection bias and enabling valid statistical inference. This method forms
the basis for many probability-based analytical techniques and hypothesis tests.
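A minimal sketch of simple random sampling, assuming a hypothetical population of 100 customer IDs. random.sample draws without replacement, and the seed is fixed only so the draw can be repeated.

```python
import random

# Hypothetical population: customer IDs 1 through 100.
population = list(range(1, 101))

# A seeded generator makes the draw repeatable for demonstration purposes.
rng = random.Random(42)

# Each member has the same probability (10/100) of ending up in the sample.
sample = rng.sample(population, k=10)

print(sorted(sample))
```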
Question 10: In hypothesis testing, what does a p-value less than 0.05 typically indicate?
A. The null hypothesis is definitely true
B. There is strong evidence against the null hypothesis
C. The alternative hypothesis is proven
D. The sample size is insufficient
CORRECT ANSWER: B. There is strong evidence against the null hypothesis
Rationale: A p-value below the significance threshold (commonly α=0.05) suggests that data at
least as extreme as those observed would be unlikely if the null hypothesis were true. This
provides statistical evidence to reject the null hypothesis, though it does not prove the
alternative hypothesis or establish practical significance.
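One way to make the definition concrete is an exact one-sided binomial test. The sketch below assumes a hypothetical experiment of 100 trials with 60 successes and a null hypothesis that the success probability is 0.5; the helper name binomial_upper_tail is illustrative, not a library function.

```python
from math import comb

def binomial_upper_tail(k, n, p0=0.5):
    """P(X >= k) for X ~ Binomial(n, p0): a one-sided exact p-value."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# Hypothetical data: 60 successes in 100 trials, tested against p0 = 0.5.
p_value = binomial_upper_tail(60, 100)

alpha = 0.05
reject_null = p_value < alpha  # True means the result is significant at alpha

print(round(p_value, 4), reject_null)
```

Rejecting the null here says only that 60 or more successes would be unusual under a success probability of 0.5; it does not say how large or how useful the difference is.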
Question 11: Which analytics tool is primarily designed for interactive data exploration and
dashboard creation rather than statistical modeling?
A. R
B. Python with scikit-learn
C. Tableau
D. SAS
CORRECT ANSWER: C. Tableau
Rationale: Tableau specializes in visual analytics, enabling users to create interactive
dashboards and perform drag-and-drop data exploration with minimal coding. While R, Python,
and SAS offer robust statistical capabilities, Tableau excels at intuitive visualization and business
user accessibility.
Question 12: What is the main advantage of using a prescriptive analytics approach over a
predictive one?
A. It requires less historical data
B. It recommends specific actions to achieve desired outcomes
C. It is easier to implement technically
D. It eliminates the need for human judgment
CORRECT ANSWER: B. It recommends specific actions to achieve desired outcomes
Rationale: Prescriptive analytics goes beyond forecasting by using optimization, simulation, or
decision analysis to suggest actionable strategies. While predictive analytics answers "What will
happen?", prescriptive analytics addresses "What should we do about it?" to drive better
business decisions.
Question 13: Which data quality dimension refers to the accuracy and correctness of data
values relative to real-world entities?
A. Timeliness
B. Completeness
C. Validity
D. Accuracy
CORRECT ANSWER: D. Accuracy
Rationale: Accuracy measures how closely data values reflect the true state of the entities they
represent. For instance, a customer's recorded age matching their actual age demonstrates
high accuracy, which is critical for reliable analytical outcomes and decision-making.
Question 14: In a correlation analysis, what does a coefficient of -0.85 indicate about the
relationship between two variables?
A. Strong positive linear relationship
B. Weak negative linear relationship
C. Strong negative linear relationship
D. No linear relationship
CORRECT ANSWER: C. Strong negative linear relationship
Rationale: Correlation coefficients range from -1 to +1. A value of -0.85 indicates a strong
inverse linear relationship: as one variable increases, the other tends to decrease, and the
points fall close to a downward-sloping line. The coefficient measures how consistent the linear
pattern is, not how large the change is. Correlation also does not imply causation, and
non-linear relationships may exist undetected.
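A minimal sketch of the Pearson coefficient computed from its definition, assuming hypothetical price and units-sold values. The second dataset shows a perfect but non-linear relationship that the coefficient fails to detect.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: units sold tend to fall as price rises.
price = [1, 2, 3, 4, 5]
units = [10, 9, 7, 8, 4]
r_linear = pearson_r(price, units)

# A perfect quadratic relationship (y = x^2) with no linear trend.
xs = [-2, -1, 0, 1, 2]
ys = [x**2 for x in xs]
r_curve = pearson_r(xs, ys)

print(round(r_linear, 2), round(r_curve, 2))
```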
Question 15: Which component of the CRISP-DM methodology focuses on preparing raw data
for modeling through cleaning, transformation, and feature engineering?
A. Business understanding
B. Data understanding
C. Data preparation
D. Modeling
CORRECT ANSWER: C. Data preparation
Rationale: The data preparation phase in CRISP-DM (Cross-Industry Standard Process for Data
Mining) involves handling missing values, removing duplicates, normalizing scales, and creating
derived variables. This phase often consumes the majority of project time but is essential for
building effective models.
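The four preparation tasks named above can be sketched in plain Python. The records, field names, median imputation, and min-max scaling are assumptions chosen for illustration; CRISP-DM itself does not prescribe specific techniques.

```python
from statistics import median

# Hypothetical raw records: one missing price and one duplicated row.
raw = [
    {"id": 1, "price": 10.0, "qty": 2},
    {"id": 2, "price": None, "qty": 1},
    {"id": 1, "price": 10.0, "qty": 2},
    {"id": 3, "price": 30.0, "qty": 4},
]

# 1. Remove duplicates, keeping the first record seen for each id.
seen = set()
clean = []
for rec in raw:
    if rec["id"] not in seen:
        seen.add(rec["id"])
        clean.append(dict(rec))

# 2. Handle missing values: fill a missing price with the median known price.
fill = median(r["price"] for r in clean if r["price"] is not None)
for rec in clean:
    if rec["price"] is None:
        rec["price"] = fill

# 3. Normalize scale: min-max scale price into the range [0, 1].
lo = min(r["price"] for r in clean)
hi = max(r["price"] for r in clean)
for rec in clean:
    rec["price_scaled"] = (rec["price"] - lo) / (hi - lo)

# 4. Create a derived variable: revenue per record.
for rec in clean:
    rec["revenue"] = rec["price"] * rec["qty"]

for rec in clean:
    print(rec)
```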
Question 16: What is the primary purpose of a control group in an A/B testing experiment?