Exam (elaborations)

Exam predictive analytics SOA Questions and Answers.

Pages
7
Grade
A+
Uploaded on
17-03-2026
Written in
2025/2026

Institution
MISY5370-Data Mining & Predictive Analytics
Course
MISY5370-Data Mining & Predictive Analytics

Data - Answer Facts and statistics collected together for reference or analysis



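Generalized linear model (GLM) - Answer a statistical technique that increases the flexibility
of a linear model by relating the mean of the target to the linear predictor through a (possibly
nonlinear) link function and by allowing the target to follow a non-normal distribution (e.g.
Poisson, binomial, gamma)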
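
A minimal sketch of the idea in R, assuming simulated count data: the names x, y and fit and the true coefficients are made up for illustration. The log link ties the mean of the target to the linear predictor.

```r
# Sketch: Poisson GLM with a log link on simulated data.
# x, y, fit and the coefficients 0.5 and 1.2 are assumptions for illustration.
set.seed(1)
x <- runif(200)
y <- rpois(200, lambda = exp(0.5 + 1.2 * x))  # mean = exp(linear predictor)

fit <- glm(y ~ x, family = poisson(link = "log"))
coef(fit)  # estimates are on the log (link) scale

# Predictions on the response scale are exp() of the linear predictor
head(predict(fit, type = "response"))
```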



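principal component analysis - Answer An unsupervised dimension reduction technique that
transforms the original variables into uncorrelated linear combinations (principal components),
ordered by how much of the variance in the data each one explains. The loadings show the
relative strength/position of each original variable in each component.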
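
A short sketch with base R's prcomp, assuming the built-in USArrests data as a stand-in example; centering and scaling are choices made here, not something the notes specify.

```r
# Sketch: PCA on the built-in USArrests data (example data only).
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

pca$rotation                         # loadings of each variable on each component
pve <- pca$sdev^2 / sum(pca$sdev^2)  # proportion of variance explained
round(pve, 3)
```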



Factor Variable vs keeping it numeric - Answer If the different "categories" have no sense of
order then it's obviously a factor. The second question is whether there is a potentially
compelling reason why the differences in the numerical values are meaningful. (the difference in
hours could be useful in predicting y throughout the day, but fall coming after spring may not be
helpful)
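
A small sketch of the two treatments, assuming a hypothetical data frame with a season code and an hour column; the names and level labels are made up for illustration.

```r
# Sketch: the same kind of coded column kept numeric vs. converted to a factor.
# df, season, hour and the level labels are hypothetical.
df <- data.frame(season = c(1, 2, 3, 4, 1, 2), hour = c(0, 6, 12, 18, 23, 8))

df$season_f <- factor(df$season, levels = 1:4,
                      labels = c("winter", "spring", "summer", "fall"))

class(df$hour)                    # stays numeric: differences in hours mean something
levels(df$season_f)               # unordered categories
model.matrix(~ season_f, df)      # a model sees one dummy column per non-base level
```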



what is an interaction and how do you justify it - Answer an interaction is when the
dependency of the target variable on a predictor is itself dependent on another predictor (i.e.
using both predictors together creates a result that wouldn't be expected based on each of them
independently.)

Justify it using boxplots and descriptive statistics that show how the mean and median of the
target differ overall versus when the interaction occurs
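
A sketch of that justification on simulated data; the variable names (workday, rush, y) and the effect size are assumptions chosen so that one predictor only matters at one level of the other.

```r
# Sketch on simulated data; names and effect sizes are assumptions.
set.seed(42)
n <- 400
df <- data.frame(
  workday = factor(sample(c("no", "yes"), n, replace = TRUE)),
  rush    = factor(sample(c("no", "yes"), n, replace = TRUE))
)
# rush only shifts y on workdays: that dependence is the interaction
df$y <- 10 + 8 * (df$workday == "yes" & df$rush == "yes") + rnorm(n)

# Descriptive statistics for each combination of the two predictors
stats <- aggregate(y ~ workday + rush, data = df,
                   FUN = function(v) c(mean = mean(v), median = median(v)))
stats

# Boxplots of the target for each combination
boxplot(y ~ workday:rush, data = df)
```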



k-means clustering - Answer unsupervised learning with the goal of assigning each record to the
group (one of the k created) that it is most similar to. k is specified at the beginning and k random
observations are chosen as the starting centroids of the clusters. From there each observation
is assigned to the cluster with the nearest centroid and new centroids are calculated. The process of assigning observations to
a cluster and recalculating centroids is repeated until the assignments are stable (the
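
A minimal sketch of the procedure with base R's kmeans, assuming the numeric columns of the built-in iris data and k = 3; scaling the variables first and using nstart = 20 are choices made here, not part of the notes.

```r
# Sketch: k-means with k = 3 on the numeric columns of iris (example data only).
set.seed(123)                     # starting centroids are random
x <- scale(iris[, 1:4])           # assumption: put variables on a common scale
km <- kmeans(x, centers = 3, nstart = 20)

km$centers          # final centroids
table(km$cluster)   # number of records assigned to each cluster
```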



Correlation Matrix (Rcode) - Answer cor(dataframe[, sapply(dataframe, is.numeric)]), which
passes all rows and only the numeric columns into the cor function
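
The same call as a runnable sketch, assuming the built-in iris data in place of dataframe:

```r
# Sketch using the built-in iris data (Species is a factor, so it is dropped).
num_cols <- sapply(iris, is.numeric)   # TRUE for each numeric column
cm <- cor(iris[, num_cols])
round(cm, 2)
```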



Density Charts (Rcode) - Answer
# Density charts for numeric variables
library(dplyr)
library(tidyr)
library(ggplot2)

data.all %>%
  select_if(is.numeric) %>%
  gather() %>% # Make key-value pairs, which allows the use of facet_wrap
  ggplot(aes(value)) +
  facet_wrap(~key, scales = "free") +
  geom_density()



creating a table (rcode) - Answer table(dataframe$desiredrow, dataframe$desiredcolumn)
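
A runnable sketch of the same call, assuming the built-in mtcars data with cyl as the row variable and gear as the column variable:

```r
# Sketch with the built-in mtcars data; rows are cyl, columns are gear.
tab <- table(mtcars$cyl, mtcars$gear)
tab
prop.table(tab, margin = 1)  # row proportions
```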



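get descriptive stats from a df subset (rcode) - Answer
library(dplyr)

# "numeric" and "factor" stand for the names of a numeric and a factor column
data.all %>%
  filter(numeric >= 5 & numeric <= 9) %>%
  group_by(factor) %>% # group_by_() is deprecated in current dplyr
  summarise(
    mean = mean(bikes_per_hour),
    median = median(bikes_per_hour),
    n = n()
  )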



Describe and interpret a K means elbow plot - Answer In an elbow plot the proportion of
variance explained (PVE), i.e. the between-cluster variance as a share of the total variance, is
calculated and plotted for successive values of k. Increases in k generally lead to diminishing
increases in the PVE until the curve forms an "elbow" where the gain in PVE from adding a new
cluster drops off sharply. The k at the elbow, just before this drop in gains, is the one that is
generally considered a good choice.
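
A sketch of how such a plot can be produced with base R, assuming the scaled numeric columns of iris and k from 1 to 8; both are example choices.

```r
# Sketch: elbow plot on scaled iris measurements (example data only).
set.seed(123)
x <- scale(iris[, 1:4])
ks <- 1:8
pve <- sapply(ks, function(k) {
  km <- kmeans(x, centers = k, nstart = 20)
  km$betweenss / km$totss      # between-cluster SS / total SS
})

plot(ks, pve, type = "b", xlab = "k", ylab = "Proportion of variance explained")
```

Read the plot from left to right and look for the k after which the line flattens.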



when should and shouldn't you use the cluster assignment to replace the original variables - Answer
when the intracluster dissimilarity is relatively low and the groups are well separated from other
groups. If most of your clusters are just partitions of a big blob of data then it's probably not a
good idea to replace. If they appear to be finding clumps then it may be valuable. Finally, by replacing you
may be reducing dimensions/eliminating a continuous relationship, which can be good or bad


Describe Bias - Answer Expected loss caused by model not being complex enough to capture
the signal in the data



Describe Variance - Answer Expected loss caused by the model being too complex and
overfitting the data



Describe the bias variance trade off - Answer Total expected loss of a model is bias + variance +
unavoidable error, with variance increasing and bias decreasing as model complexity/additional
predictors grow. Total loss therefore forms a U shape whose minimum is at the point where the
model is complex enough to accurately portray the signal in the data but not so complex that it
overfits the data. That said as a general statement we think of bias and variance as a trade off
whereby reducing variance you gain more bias and vice versa. When comparing different models their performance on the test
set is the best way to see
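
A sketch of comparing models of increasing complexity on held-out data, assuming simulated data: the true signal, noise level, split and polynomial degrees are all made up for illustration.

```r
# Sketch on simulated data; signal, noise level and degrees are assumptions.
set.seed(7)
n <- 200
d <- data.frame(x = runif(n, -2, 2))
d$y <- sin(2 * d$x) + rnorm(n, sd = 0.3)
train <- sample(n, 100)                  # half for training, half held out

degrees <- c(1, 3, 5, 10, 15)
rmse <- t(sapply(degrees, function(deg) {
  fit <- lm(y ~ poly(x, deg), data = d[train, ])
  err <- function(rows) sqrt(mean((d$y[rows] - predict(fit, d[rows, ]))^2))
  c(degree = deg, train = err(train), test = err(-train))
}))
rmse   # training error keeps falling; compare the test column across degrees
```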
