MISY 5380 PRACTICE EXAM QUESTIONS AND
DETAILED SOLUTIONS 2026
▶ Line chart? Answer:A line connects the points in the chart.
Useful for time series data collected over a period of time. (minutes, hours,
days, years, etc.)
▶ Sparkline? Answer:Minimalist type of line chart that can be placed
directly into a cell in Excel.
Contain no axes; they display only the line for the data.
▶ Bar Charts? Answer:Use horizontal bars to display the magnitude of the
quantitative variable.
▶ Column Charts? Answer:Use vertical bars to display the magnitude of
the quantitative variable
▶ Pie charts? Answer:Common form of chart used to compare categorical
data.
▶ Bubble chart? Answer:Graphical means of visualizing three variables in
a two-dimensional graph.
Sometimes a preferred alternative to a 3-D graph
▶ Heat map? Answer:A two-dimensional graphical representation of data
that uses different shades of color to indicate magnitude
▶ Stacked column chart? Answer:Allows the reader to compare the relative
values of quantitative variables for the same category in a bar chart
▶ Clustered column (or bar) chart? Answer:An alternative chart to stacked
column chart for comparing quantitative variables
▶ Scatter chart matrix? Answer:Useful chart for displaying multiple
variables.
▶ Geographic Information Systems (GIS)? Answer:A system that merges
maps and statistics to present data collected over different geographies
▶ Data dashboard? Answer:Data visualization tool that illustrates multiple
metrics and automatically updates these metrics as new data become
available
▶ Supervised learning? Answer:Data Mining approach for prediction and
classification
▶ Unsupervised learning? Answer:Data Mining approach used to detect
patterns and relationships in the data
▶ Data Sampling? Answer:When dealing with large volumes of data, it is
best practice to extract a representative sample for analysis.
A sample is representative if the analyst can draw the same conclusions
from it as from the entire population of data.
The sample of data must be large enough to contain significant information,
yet small enough to be manipulated quickly.
▶ Data Preparation? Answer:The data in a data set are often said to be
"dirty" and "raw" before they have been preprocessed.
We need to put them into a form that is best suited for a data-mining
algorithm.
Data preparation makes heavy use of descriptive statistics and data
visualization methods.
▶ Unsupervised learning application? Answer:The goal is to use the
variable values to identify relationships between observations.
Qualitative assessments, such as how well the results match expert
judgment, are used to assess unsupervised learning methods.
▶ Cluster Analysis? Answer:The goal of this unsupervised learning method
is to segment observations into similar groups based on the observed
variables
Can be employed during the data preparation step to identify variables or
observations that can be aggregated or removed from consideration
▶ Types of Clustering Methods? Answer:Hierarchical and K-Means
▶ Euclidean distance? Answer:Most common method to measure
dissimilarity between observations, when observations include continuous
variables
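A minimal sketch of the Euclidean distance between two observations, in Python. The function name and the sample observations are hypothetical, chosen only for illustration.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two observations measured on the same variables."""
    if len(u) != len(v):
        raise ValueError("observations must have the same number of variables")
    # Square the difference on each variable, add them up, take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two hypothetical observations measured on two continuous variables
obs_a = (3.0, 4.0)
obs_b = (0.0, 0.0)
print(euclidean_distance(obs_a, obs_b))  # 5.0
```

A larger value means the two observations are more dissimilar.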
▶ Hierarchical clustering? Answer:Bottom-up approach
Determines the similarity of two clusters by considering the similarity
between the observations composing either cluster
▶ Single linkage? Answer:The similarity between two clusters is defined by
the similarity of the pair of observations (one from each cluster) that are the
most similar
▶ Complete linkage? Answer:This clustering method defines the similarity
between two clusters as the similarity of the pair of observations (one from
each cluster) that are the most different
▶ Average linkage? Answer:Defines the similarity between two clusters to
be the average similarity computed over all pairs of observations between
the two clusters
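The three linkage rules above can be sketched in Python as follows. This sketch assumes dissimilarity is measured with Euclidean distance, so "most similar" becomes the smallest distance and "most different" the largest. The function name and the sample clusters are hypothetical.

```python
import math
from itertools import product

def linkage_distance(cluster_a, cluster_b, method="single"):
    """Dissimilarity between two clusters under single, complete, or average linkage."""
    # Distance for every pair made of one observation from each cluster
    pair_distances = [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]
    if method == "single":      # closest (most similar) pair
        return min(pair_distances)
    if method == "complete":    # farthest (most different) pair
        return max(pair_distances)
    if method == "average":     # mean over all pairs
        return sum(pair_distances) / len(pair_distances)
    raise ValueError(f"unknown linkage method: {method}")

# Hypothetical clusters of two observations each
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]
for method in ("single", "complete", "average"):
    print(method, linkage_distance(cluster_a, cluster_b, method))
```

Here the four pairwise distances are 4, 6, 3, and 5, so single linkage gives 3, complete linkage gives 6, and average linkage gives 4.5.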
▶ Ward's method? Answer:Computes dissimilarity as the sum of the
squared differences in similarity between each individual observation in the
union of the two clusters and the centroid of the resulting merged cluster
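A minimal Python sketch of the quantity described for Ward's method: merge the two clusters, find the centroid of the merged cluster, and sum the squared Euclidean distances from each observation to that centroid. The function name and sample clusters are hypothetical, and Euclidean distance is an assumption of this sketch.

```python
def ward_dissimilarity(cluster_a, cluster_b):
    """Sum of squared distances from each observation in the merged cluster to its centroid."""
    merged = list(cluster_a) + list(cluster_b)
    # Centroid = mean of each variable across the merged cluster
    centroid = [sum(column) / len(merged) for column in zip(*merged)]
    return sum(
        sum((value - center) ** 2 for value, center in zip(obs, centroid))
        for obs in merged
    )

# Hypothetical clusters; the merged centroid is (3.0, 0.0)
cluster_a = [(0.0, 0.0), (2.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]
print(ward_dissimilarity(cluster_a, cluster_b))  # 20.0
```

The squared distances to the centroid are 9, 1, 1, and 9, which sum to 20.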
▶ k-Means clustering? Answer:Given a value of k, the k-means algorithm
randomly partitions the observations into k clusters.
After all observations have been assigned to a cluster, the resulting cluster
centroids are calculated.
Using the updated cluster centroids, all observations are reassigned to the
cluster with the closest centroid.
These two steps repeat until the cluster assignments stop changing.
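The k-means steps can be sketched in plain Python as below. This is an illustrative sketch, not a production implementation: the function name, the `max_iter` and `seed` parameters, the shuffled round-robin start (used so that no cluster begins empty), and the sample data are all assumptions made for the example.

```python
import math
import random

def k_means(observations, k, max_iter=100, seed=0):
    """Return (labels, centroids) after iterating the assign/update steps."""
    if not 1 <= k <= len(observations):
        raise ValueError("k must be between 1 and the number of observations")
    rng = random.Random(seed)
    # Step 1: random initial partition into k clusters
    order = list(range(len(observations)))
    rng.shuffle(order)
    labels = [0] * len(observations)
    for position, index in enumerate(order):
        labels[index] = position % k
    centroids = []
    for _ in range(max_iter):
        # Step 2: compute the centroid (mean of each variable) of every cluster
        updated = []
        for c in range(k):
            members = [obs for obs, lab in zip(observations, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # cluster emptied: keep its old centroid
        centroids = updated
        # Step 3: reassign every observation to the cluster with the closest centroid
        new_labels = [
            min(range(k), key=lambda c: math.dist(obs, centroids[c]))
            for obs in observations
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids

# Hypothetical data: two well-separated groups of three observations
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

With this data the first three observations end up in one cluster and the last three in the other; which cluster gets label 0 depends on the random start.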
▶ Association rules? Answer:if-then statements
Convey the likelihood of certain items being purchased together
▶ Antecedent? Answer:The collection of items (or item set) corresponding
to the if portion of the rule
▶ Consequent? Answer:The item set corresponding to the then portion of
the rule
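One way to put a number on the likelihood an association rule conveys is the share of transactions containing the antecedent that also contain the consequent. That measure is commonly called confidence; it is not defined in the cards above, so treat the Python sketch below as an illustration under that assumption. The function name and the transactions are hypothetical.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing the antecedent that also contain the consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Transactions that satisfy the "if" portion of the rule
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        return 0.0
    # Of those, the ones that also satisfy the "then" portion
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)

# Hypothetical shopping-cart transactions
transactions = [
    {"bread", "jelly"},
    {"bread", "jelly", "peanut butter"},
    {"bread", "milk"},
    {"bread"},
    {"milk"},
]
# Rule: if {bread} then {jelly}
print(rule_confidence(transactions, {"bread"}, {"jelly"}))  # 0.5
```

Bread appears in four transactions and jelly appears in two of those, so the rule holds half the time.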