Exam (elaborations)

LATEST ISYE 6501 Spring '23 Exam 1 QUESTIONS WITH 100% SOLUTIONS 2024

Pages
6
Grade
A+
Uploaded on
21-02-2024
Written in
2023/2024

What is modeling? - ANSWER Describing a real-life situation mathematically, analyzing the math, and then turning the math back into a real-life situation.
What is a data point? - ANSWER A row of data: all of the information about one observation.
What are some (4) names for columns of data? - ANSWER Attributes, features, covariates, predictors.
Name some common types of structured data (5) - ANSWER Quantitative, categorical, binary, unrelated, time series.
What is binary data? - ANSWER A subset of categorical data (although it can be treated as numerical) that can take on only one of two values (Y/N, M/F, etc.).
What is unstructured data? Example? - ANSWER Data that is not easily described or stored. Example: text.
When do you use a soft classifier? - ANSWER When no line can separate all of the data points into their classes.
What are support vectors? - ANSWER The data points that touch the parallel margin lines and so "support" the shape between them.
SVM - what does it mean if the coefficients are near zero? - ANSWER The corresponding attributes are probably not relevant for classification.
SVM - Does a classifier need to be a straight line? - ANSWER No. Kernel methods allow for nonlinear classifiers.
What is the most common scaling used? - ANSWER Scaling data between 0 and 1.
What is standardization? - ANSWER Rescaling data so that it has mean 0 and standard deviation 1, the scale of a standard normal distribution.
When might you use scaling over standardization? - ANSWER When your data needs to be in a bounded range. Examples: neural networks, optimization models, etc.
What are two models that work better with standardization over scaling? - ANSWER Principal component analysis and clustering.
How does KNN determine what a new point's class will be? - ANSWER The new data point's class is the most common class among its k nearest neighbors.
What is the most common method to measure distance between a point and its neighbors in KNN? - ANSWER Straight-line distance, although other distance measures can be used.
How can you adjust KNN for attributes that are more important for the classification? - ANSWER Weight the distances and give those attributes more weight.
What type of effects explain why we can't measure the model's effectiveness on training data? - ANSWER Random effects.
Why is the observed performance on high performing models probably too optimistic? - ANSWER High performing models are the most likely to have benefited from random effects.
Which set of data is used to choose the best model? - ANSWER Validation set.
Which set of data is used for building models? - ANSWER Training set.
Which set of data is used for measuring a model's performance? - ANSWER Test set.
Name two ways to split the data into training and test sets - ANSWER Random and rotation.
Given a set of time series data, which might be a better way to split the data into training and test sets and why? Random or rotation? - ANSWER Rotation, because it spreads the data evenly across the time range, whereas a random split could select more values from one year.
Which should we use the majority of our data for: training, validation or test? - ANSWER Training. Most experts recommend using 50-70% for training and splitting the rest equally between validation and test.
What is a solution to the problem of more important data points appearing only in the validation or test set? - ANSWER Cross validation.
How do you measure the model's quality using k-fold cross validation? - ANSWER Average the k evaluations to estimate the quality (k=10 is common).
What are three advantages of k-fold cross validation? - ANSWER Better use of data, a better estimate of model quality, and choosing your model more effectively.
In k-fold cross validation, how many times is each part of the data used for training and validation? - ANSWER k-1 times for training and once for validation.
What type of model might you build to figure out where to build police stations? - ANSWER Clustering.
What are the three steps k-means clustering uses? - ANSWER 1. Pick k cluster centers. 2. Assign each data point to the nearest cluster center. 3. Recalculate the cluster centers (centroids). Repeat steps 2 and 3 until there are no more changes.
What is a heuristic? - ANSWER An algorithm that isn't guaranteed to find the best solution but usually gets close and does so quickly.
What is the best way to deal with an outlier? - ANSWER Find out more about it and what it means in the situation you're working on before deciding to remove it or keep it.
What is a method to pick the number of clusters to use for k-means clustering? - ANSWER An elbow diagram. Find the point where the benefit of adding another cluster gets really small (the curve flattens).
How do you use k-means for predictive analytics? - ANSWER Find the cluster center nearest to the new point and assign the point to that cluster.
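The three k-means steps and the prediction rule from the answers above can be sketched in a few lines of Python. This is a minimal illustration rather than course-provided code: it assumes NumPy is available, and the function names `k_means` and `assign_to_nearest`, the `max_iters` cap, and the choice of random data points as starting centers are all assumptions made for the example.

```python
import numpy as np

def assign_to_nearest(points, centers):
    """Return, for each point, the index of the nearest center (straight-line distance)."""
    # dists[i, j] = distance from point i to center j
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def k_means(points, k, max_iters=100, seed=0):
    """Heuristic k-means sketch. Returns (centers, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick k cluster centers (assumption: k distinct data points at random).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iters):
        # Step 2: assign each data point to the nearest cluster center.
        new_labels = assign_to_nearest(points, centers)
        # Stop once an iteration changes no assignments.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: recalculate each center as the mean (centroid) of its points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                centers[j] = members.mean(axis=0)
    return centers, labels

# Usage: cluster six points, then "predict" the cluster of a new point
# by finding its nearest cluster center.
data = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centers, labels = k_means(data, k=2)
new_point_cluster = assign_to_nearest(np.array([[9.0, 9.0]]), centers)[0]
```

Because k-means is a heuristic, a different `seed` can give a different final clustering on less cleanly separated data, which is one reason to run it from several starting points.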
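The k-fold cross validation answers above (each part used k-1 times for training and once for validation, with the k evaluations averaged) can be illustrated with a short sketch. It assumes NumPy, and every name in it is hypothetical: `k_fold_indices`, `cross_validate`, the `fit_and_score` callback, and the small `knn_accuracy` scorer, which uses a majority vote over straight-line distances and assumes class labels are non-negative integers.

```python
import numpy as np

def k_fold_indices(n_points, k, seed=0):
    """Shuffle the row indices and split them into k nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_points), k)

def knn_accuracy(X_train, y_train, X_val, y_val, n_neighbors=3):
    """Classify each validation point by majority vote of its nearest neighbors."""
    dists = np.linalg.norm(X_val[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :n_neighbors]
    votes = y_train[nearest]  # shape: (number of validation points, n_neighbors)
    preds = np.array([np.bincount(row).argmax() for row in votes])
    return float(np.mean(preds == y_val))

def cross_validate(X, y, k, fit_and_score):
    """Average the k validation scores; each fold validates once, trains k-1 times."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, val_idx in enumerate(folds):
        # Every fold except fold i is used for training in this round.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        scores.append(fit_and_score(X[train_idx], y[train_idx], X[val_idx], y[val_idx]))
    return float(np.mean(scores))

# Usage on a toy data set: two well-separated groups of 10 points each.
X = np.array([[0.1 * i, 0.0] for i in range(10)] + [[10 + 0.1 * i, 0.0] for i in range(10)])
y = np.array([0] * 10 + [1] * 10)
estimated_quality = cross_validate(X, y, k=5, fit_and_score=knn_accuracy)
```

Passing the model as a callback keeps the splitting logic independent of the classifier, so the same loop could score different candidate models on the same folds.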

Institution
ISYE 6501
Course
ISYE 6501


Seller: contenthive76