Exam (worked answers)

Final Exam Quiz: ISYE 6501/ ISYE6501, Introduction to Analytics Modeling (Complete Updated Fall 2025/26) | Quiz Score: 96.89 out of 100.04 - Georgia Institute Of Technology.

Uploaded on

08-12-2025

Written in

2025/2026

Instructions for Questions 1-8
For each of the following eight questions, select the type of problem that the model is best suited for. Each type of problem might be used zero, one, or more than one time in the eight questions.

Question 1 0.5 / 0.5 pts
Select the type of problem that linear regression is best suited for.
Classification
Clustering
Experimental design
Prediction from feature data
Prediction from time-series data
Variable selection

Question 2 0.5 / 0.5 pts
Select the type of problem that fractional factorial design is best suited for.
Classification
Clustering
Experimental design
Prediction from time-series data
Variable selection and/or prediction from feature data

Question 3 0.5 / 0.5 pts
Select the type of problem that lasso regression is best suited for.
Classification
Clustering
Experimental design
Prediction from time-series data
Variable selection and/or prediction from feature data

Question 4 0.5 / 0.5 pts
Select the type of problem that logistic regression is best suited for.
Classification and/or prediction from feature data
Clustering
Experimental design
Prediction from time-series data
Variable selection

Question 5 0.5 / 0.5 pts
Select the type of problem that k-means is best suited for.
Classification
Clustering
Experimental design
Prediction from feature data
Prediction from time-series data
Variable selection

Question 6 0.5 / 0.5 pts
Select the type of problem that support vector machine is best suited for.
Classification and/or prediction from feature data
Clustering
Experimental design
Prediction from time-series data
Variable selection

Question 7 0.5 / 0.5 pts
Select the type of problem that GARCH is best suited for.
Classification and/or prediction from feature data
Clustering
Experimental design
Prediction from time-series data
Variable selection

Question 8 0.5 / 0.5 pts
Select the type of problem that factorial design is best suited for.
Classification and/or prediction from feature data
Clustering
Experimental design
Prediction from time-series data
Variable selection

Instructions for Questions 9-16
For each of the following eight questions, select the type of analysis that the model is best suited for. Each type of analysis might be used zero, one, or more than one time in the eight questions.

Question 9 0.63 / 0.63 pts
Select the type of analysis that linear regression is best suited for.
Using feature data to predict the amount of something two time periods in the future
Using feature data to predict the probability of something happening and/or whether or not something will happen two time periods in the future
Using time-series data to predict the amount of something two time periods in the future
Using time-series data to predict the variance of something two time periods in the future

Question 10 0.63 / 0.63 pts
Select the type of analysis that k-nearest-neighbor classification is best suited for.
Using feature data to predict the amount and/or probability of something two time periods in the future
Using feature data to predict whether or not something will happen two time periods in the future
Using time-series data to predict the amount of something two time periods in the future
Using time-series data to predict the variance of something two time periods in the future

Question 11 0.63 / 0.63 pts
Select the type of analysis that GARCH is best suited for.
Using feature data to predict the amount of something two time periods in the future
Using feature data to predict the probability of something happening two time periods in the future
Using feature data to predict whether or not something will happen two time periods in the future
Using time-series data to predict the amount of something two time periods in the future
Using time-series data to predict the variance of something two time periods in the future

Question 12 0.63 / 0.63 pts
Select the type of analysis that a k-nearest-neighbor classification tree is best suited for.
Using feature data to predict the amount of something two time periods in the future
Using feature data to predict the probability of something happening two time periods in the future
Using feature data to predict whether or not something will happen two time periods in the future
Using time-series data to predict the amount of something two time periods in the future
Using time-series data to predict the variance of something two time periods in the future

Question 13 0.63 / 0.63 pts
Select the type of analysis that a linear regression tree is best suited for.
Using feature data to predict the amount of something two time periods in the future
Using feature data to predict the probability of something happening two time periods in the future
Using feature data to predict whether or not something will happen two time periods in the future
Using time-series data to predict the amount of something two time periods in the future
Using time-series data to predict the variance of something two time periods in the future

Question 14 0.63 / 0.63 pts
Select the type of analysis that a random logistic regression forest is best suited for.
Using feature data to predict the amount of something two time periods in the future
Using feature data to predict the probability of something happening and/or whether or not something will happen two time periods in the future
Using time-series data to predict the amount of something two time periods in the future
Using time-series data to predict the variance of something two time periods in the future

Question 15 0.63 / 0.63 pts
Select the type of analysis that ARIMA is best suited for.
Using feature data to predict the amount of something two time periods in the future
Using feature data to predict the probability of something happening two time periods in the future
Using feature data to predict whether or not something will happen two time periods in the future
Using time-series data to predict the amount of something two time periods in the future
Using time-series data to predict the variance of something two time periods in the future

Question 16 0.63 / 0.63 pts
Select the type of analysis that a support vector machine is best suited for.
Using feature data to predict the amount and/or the probability of something two time periods in the future
Using feature data to predict whether or not something will happen two time periods in the future
Using time-series data to predict the amount of something two time periods in the future
Using time-series data to predict the variance of something two time periods in the future

Question 17 3 / 4 pts
For each type of data, specify whether it is or is not time-series data.
i. Characteristics of a day (day of the week, season, temperature, amount of rainfall, etc.) that might affect the number of burgers sold: NOT TIME-SERIES
ii. Fraction of burgers sold that had cheese, on each of the past 2000 days: TIME-SERIES
iii. Number of burgers a restaurant sold on each of the past 2000 days: TIME-SERIES
iv. Number of toppings on each burger sold in the past 2000 days: TIME-SERIES
Answer 1: NOT TIME-SERIES
Answer 2: TIME-SERIES
Answer 3: TIME-SERIES
Answer 4: You Answered TIME-SERIES; Correct Answer NOT TIME-SERIES

Question 18 4 / 4 pts
Below are three statements about data that is scaled before point outliers are removed. For each statement, select the choice that makes the statement correct.
i. If data is scaled first, the range of data after outliers are removed will be NARROWER than intended.
ii. Point outliers WOULD NOT appear to be valid data if not removed before scaling.
iii. Valid data WOULD NOT appear to be outliers if data is scaled first.
Answer 1: NARROWER
Answer 2: WOULD NOT
Answer 3: WOULD NOT

Question 19 4 / 4 pts
For each of the four situations below, specify whether using a variable selection approach like lasso or stepwise regression would be important.
i. Time-series data is being used: No, don't use variable selection
ii. There are fewer data points than variables: Yes, use variable selection
iii. There are too few data points to avoid overfitting if all variables are included: Yes, use variable selection
iv. It is too costly to create a model with a large number of variables: Yes, use variable selection
Answer 1: No, don't use variable selection
Answer 2: Yes, use variable selection
Answer 3: Yes, use variable selection
Answer 4: Yes, use variable selection

Instructions for Questions 20-23
For each of the following four questions, select the type of model that the software package is best suited for analyzing. Each type of model might be used zero, one, or more than one time in the four questions.

Question 20 1 / 1 pts
Which type of model is SimPy best suited for?
Discrete-event simulation
Linear regression
Linear programming (optimization)

Question 21 1 / 1 pts
Which type of model is ARENA best suited for?
Discrete-event simulation
Linear regression
Linear programming (optimization)

Question 22 1 / 1 pts
Which type of model is R best suited for?
Discrete-event simulation
Linear regression
Linear programming (optimization)

Question 23 1 / 1 pts
Which type of model is PuLP best suited for?
Discrete-event simulation
Linear regression
Linear programming (optimization)

Instructions for Questions 24-37
For each of the following 14 R functions, select the analytics task that the R function is directly suitable for. If the function does not do any of the choices, then select "none of the above". Each task might be used zero, one, or more than one time in the 14 questions.

Question 24 0.5 / 0.5 pts
train
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 25 0.5 / 0.5 pts
prcomp
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 26 0.5 / 0.5 pts
kmeans
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 27 0.5 / 0.5 pts
ggplot
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 28 0.5 / 0.5 pts
lm
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 29 0.5 / 0.5 pts
cv
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 30 0.5 / 0.5 pts
HoltWinters
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 31 0.5 / 0.5 pts
ksvm
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 32 0.5 / 0.5 pts
scale
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 33 0.5 / 0.5 pts
glm
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
None of the other choices

Question 34 0.5 / 0.5 pts
kknn
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 35 0.5 / 0.5 pts
predict
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 36 0.5 / 0.5 pts
randomForest
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Question 37 0.5 / 0.5 pts
FrF2
Cross-validation
Graphing
Holt-Winters
k-means
k-nearest-neighbor
Linear regression
Make predictions from models
PCA
Random forest
Scale data
Support vector machine
Train various models
None of the other choices

Information for Questions 38-40
The following process was followed to predict
sales of a product each month for the next three years:
1. Split past sales data randomly into three sets: training, validation, and test.
2. Build 20 different models using the training data.
3. Evaluate all 20 models on the validation data.
4. Select the model that performed best on the validation data.
5. Evaluate the selected model on the test data.
6. Use the selected model to predict monthly sales for the next three years based on real-time data, and observe its true performance.

Question 38 1 / 1 pts
Which of the following three statements is correct?
Every model's expected performance on training data will be the same as its expected performance on the validation data, because both the training data and the validation data are taken from the same population.
Every model's expected performance on training data will be worse than its expected performance on the validation data, because the training data and the validation data are different.
Every model's expected performance on training data will be better than its expected performance on the validation data, because the model fits partly to random patterns in the training data.

Question 39 1 / 1 pts
Which of the following three statements is correct?
It is unclear how the selected model's expected performance on test data compares to its observed performance on real-time data, because the training data and the test data were taken from the same population, but the real-time data might be different.
The selected model's expected performance on test data must be worse than its observed performance on real-time data, because the training data and test data were taken from the same population, but the real-time data might be different.
The selected model's expected performance on test data must be better than its observed performance on real-time data, because the training data and test data were taken from the same population, but the real-time data might be different.
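
Steps 2 to 5 of the process described for Questions 38-40 can be sketched as a small simulation. This is a minimal illustration and not part of the quiz: it assumes 20 hypothetical models that all have the same true accuracy (0.70), so any difference between them on the validation set is pure random noise. The function name `simulate_selection`, the sample sizes, and the use of classification accuracy instead of a sales-forecast error are all assumptions made for illustration.

```python
import random


def simulate_selection(n_models=20, n_points=200, true_acc=0.70,
                       trials=500, seed=0):
    """Average validation and test accuracy of the model picked on validation.

    All models are assumed (hypothetically) to share the same true accuracy,
    so the "best" model on validation data is only the luckiest one.
    """
    rng = random.Random(seed)

    def observed_accuracy():
        # Fraction of n_points predicted correctly when each prediction
        # is independently correct with probability true_acc.
        hits = sum(rng.random() < true_acc for _ in range(n_points))
        return hits / n_points

    val_total = 0.0
    test_total = 0.0
    for _ in range(trials):
        # Steps 3-4: score every model on validation data, keep the best score.
        val_total += max(observed_accuracy() for _ in range(n_models))
        # Step 5: re-score the chosen model on independent test data.
        test_total += observed_accuracy()
    return val_total / trials, test_total / trials


val_avg, test_avg = simulate_selection()
print(f"selected model, average validation accuracy: {val_avg:.3f}")
print(f"selected model, average test accuracy:       {test_avg:.3f}")
```

Under these assumptions the selected model's validation score comes out higher than its test score even though no model is truly better than another, which is the selection effect that a separate test set is meant to expose.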
Question 40 1 / 1 pts
Which of the following three statements is correct?
The selected model's expected performance on test data will be better than its expected performance on the validation data, because there is a selection bias: the selected model is more likely to have worse-than-average performance on random patterns in the validation data.
The selected model's expected performance on test data will be the same as its expected performance on the validation data, because the validation data and the test data are the same.
The selected model's expected performance on test data will be worse than its expected performance on the validation data, because there is a selection bias: the selected model is more likely to have better-than-average performance on random patterns in the validation data.

Question 41 4 / 4 pts
A positive correlation has been observed between number of police and amount of crime reported (where there are more police per capita more crime is reported, and where more crime is reported there are more police per capita). Based on that observed correlation, select all of the following statements about the direction of causality between police and crime reports that are true.
A. Police cause crime reports: Where more police are working, citizens report more crime to them. NOT SELECTED AS TRUE
B. Crime reports cause police: Where there is more crime reported, more police are hired to stop it. NOT SELECTED AS TRUE
C. Both more police and more crime reports are positively correlated with another factor, which causes both. NOT SELECTED AS TRUE
D. Can't tell without more analysis. SELECTED AS TRUE
Answer 1: NOT SELECTED AS TRUE
Answer 2: NOT SELECTED AS TRUE
Answer 3: NOT SELECTED AS TRUE
Answer 4: SELECTED AS TRUE

Question 42 4 / 4 pts
For each of the four situations below, specify which would be better: including a "data missing" binary variable or imputing missing data.
A. 2% of the data points have missing values, and you can build a good predictive model for the missing data. IMPUTE MISSING DATA
B. 2% of the data points have missing values, and you cannot build a good predictive model for the missing data. "DATA MISSING" BINARY VARIABLE
C. 50% of the data points have missing values for this variable, and you believe that points with missing data have a different distribution of values from points where data is present. "DATA MISSING" BINARY VARIABLE
D. 50% of the data points have missing values for this variable, and you cannot build a good predictive model for the missing data. "DATA MISSING" BINARY VARIABLE
Answer 1: IMPUTE MISSING DATA
Answer 2: "DATA MISSING" BINARY VARIABLE
Answer 3: "DATA MISSING" BINARY VARIABLE
Answer 4: "DATA MISSING" BINARY VARIABLE

Instructions for Questions 43-46
For each of the following four questions, select the model that is more directly appropriate. Assume you have a relevant set of predictor data to use. Each type of model might be used zero, one, or more than one time in the four questions.

Question 43 1 / 1 pts
Which model is more directly appropriate to estimate the probability that a patient survives heart transplant surgery?
Linear regression
Logistic regression

Question 44 1 / 1 pts
Which model is more directly appropriate to estimate the amount of time it will take to process a certain loan application?
Linear regression
Logistic regression

Question 45 1 / 1 pts
Which model is more directly appropriate to forecast the number of hot dogs that will be sold at a baseball game?
Linear regression
Logistic regression

Question 46 1 / 1 pts
Which model is more directly appropriate to estimate the likelihood that a flight from Atlanta to Detroit will take more than two hours?
Linear regression
Logistic regression

Instructions for Questions 47-49
For the situations in each of the following three questions, select whether a supervised learning model (like classification) is more directly appropriate than an unsupervised learning model (like clustering).

Question 47 1 / 1 pts
For each data point, the response is not known and there is no expert estimate.
Supervised learning model (like classification)
Unsupervised learning model (like clustering)

Question 48 1 / 1 pts
For each data point, the response is not known but an expert has provided an estimate of the response.
Supervised learning model (like classification)
Unsupervised learning model (like clustering)

Question 49 1 / 1 pts
For each data point, the response is known.
Supervised learning model (like classification)
Unsupervised learning model (like clustering)

Question 50 4 / 4 pts
An insurance company has data on past customers' attributes and how much each of their car insurance policies paid out. Now, the company wants to analyze the possibility of selling car insurance policies to new customers. For each of the following situations, specify which model is more appropriate: classification or linear regression.
A. The insurance company wants to determine whether or not a specific customer would switch to this company's coverage. CLASSIFICATION
B. The insurance company wants to estimate the amount that a specific customer would be willing to pay for coverage. LINEAR REGRESSION
C. The insurance company wants to estimate the number of car accidents a specific customer will get into in the next five years. LINEAR REGRESSION
Answer 1: CLASSIFICATION
Answer 2: LINEAR REGRESSION
Answer 3: LINEAR REGRESSION

Instructions for Questions 51-54
For each of the following four questions, select the model that is more directly appropriate. Assume you have a relevant set of predictor data to use. Each type of model might be used zero, one, or more than one time in the four questions.
Question 51 1 / 1 pts
Given distances and current and predicted travel speeds on each road, find the quickest way to drive from your current location to Georgia Tech if there are no unexpected delays. Which model is more directly appropriate?
Simulation
Optimization

Question 52 1 / 1 pts
Given the rates of people moving from room to room in a museum, times and routes to walk from one room to another, and capacities of rooms and hallways and doorways, find the maximum number of people the museum should allow inside so that congestion is unlikely. Which model is more directly appropriate?
Simulation
Optimization

Question 53 1 / 1 pts
Given the distributions of manufacturing time at each of 100 steps of a manufacturing process, and the probability of requiring rework at each of the steps, estimate the distribution of the time it will take to produce 10,000 units of a product. Which model is more directly appropriate?
Simulation
Optimization

Question 54 1 / 1 pts
Given the weights and volumes of thousands of proposed scientific experiments that could be sent into space on the next private rocket launch, the amount of money each lab has offered to pay for its experiment to be included, and the capacity of the rocket, find the set of experiments that will maximize the income of the company launching the rocket. Which model is more directly appropriate?
Simulation
Optimization

IMPORTANT INSTRUCTIONS FOR QUESTION 55
This question has six parts (A,B,C,D,E,F). Each part has several choices. For each choice, you must answer whether you have "SELECTED" or "NOT SELECTED" that choice. If you leave a choice unanswered, it will be counted as wrong.

Question 55 16.8 / 18 pts
A trucking company that has focused service on 50 major cities nationally would like to make its network more efficient by having better predictions of day-to-day demand from each city to each other city, and then reallocating its resources (drivers, vehicles, etc.) to better meet that demand.
This description is simplified from its real complexity; if you're an expert in the trucking industry, please do not rely on your expertise to fill in all that extra complexity (you'll end up making the questions more complex than I intended).

A. Select all of the models/approaches the company could use to predict the city-to-city demand for each pair of cities on each day, based on daily city-to-city demand data for the past six years.
i. ARIMA [ Select ]
ii. Exponential smoothing [ Select ]
iii. k-means [ Select ]
iv. Linear regression [ Select ]
v. Queuing [ Select ]

B. Select all of the models/approaches the company could use to allocate its resources most efficiently from day to day, based on the demand estimate and on the costs of moving drivers and vehicles from one city to another.
i. Elastic net [ Select ]
ii. Linear regression NOT SELECTED
iii. Optimization SELECTED
iv. Principal component analysis NOT SELECTED
v. Support vector machine NOT SELECTED

Suppose the company wants to start service to a new city, and would like to estimate demand to and from the new city, based on attributes of each city and the known previous demand of the 50 cities the company currently serves.

C. Select all of the models/approaches the company could use to estimate demand between the new city and each of the other 50 cities served.
i. CUSUM NOT SELECTED
ii. Discrete event simulation NOT SELECTED
iii. Linear regression SELECTED
iv. Logistic regression tree NOT SELECTED
v. Random linear regression forest SELECTED

Based on projected demand to/from the new city, the company needs to decide how many additional drivers to hire. The company has identified ten different scenarios that might happen (various demand levels, fuel prices, economic situations, etc.), and would like to find a hiring plan that has the best expected value, subject to some constraints.

D. Select all of the following models/approaches that the company could use to determine its driver hiring plan.
i. Discrete-event simulation, running thousands of replications with the same number of additional drivers in each replication SELECTED
ii. Discrete-event simulation, running thousands of replications for each number of additional drivers NOT SELECTED
iii. Dynamic programming, to suggest hiring decisions over time SELECTED
iv. Markov chain, with the number of extra drivers hired as the states NOT SELECTED
v. Stochastic optimization, to find the optimal number of drivers across scenarios SELECTED

In the past, the company has grown by purchasing smaller competitors, so it has a lot of data on small trucking companies: financial data, asset (trucks, drivers, etc.) data, demand data, and whether the small company was willing to be purchased. Now, the company would like to build a model to identify more small competitors who they might purchase, based on this data.

E. Select all of the models/approaches the company could use to estimate which small competitors would accept a purchase offer.
i. CUSUM NOT SELECTED
ii. k-nearest-neighbor classification SELECTED
iii. Logistic regression SELECTED
iv. Multi-armed bandit NOT SELECTED
v. Support vector machine SELECTED

The company plans to buy television advertising time during the Super Bowl (which usually costs millions of dollars for a 30-second spot).

F. Select all of the following models/approaches the company could use to test ads so it can choose the one that will increase demand the most.
i. A/B testing (assuming there are two ads being tested) SELECTED
ii. Fractional factorial design SELECTED
iii. GARCH NOT SELECTED
iv. Integer programming NOT SELECTED
v. Multi-armed bandit (assuming there are more than two ads being tested)

Answers 1-10: SELECTED, SELECTED, SELECTED, NOT SELECTED, NOT SELECTED, NOT SELECTED, NOT SELECTED, NOT SELECTED, SELECTED, NOT SELECTED
Answers 11-18: NOT SELECTED, NOT SELECTED, NOT SELECTED, SELECTED, NOT SELECTED, SELECTED, (You Answered SELECTED; Correct Answer NOT SELECTED), (Correct Answer SELECTED; You Answered NOT SELECTED), SELECTED
Answers 19-28: NOT SELECTED, SELECTED, NOT SELECTED, SELECTED, SELECTED, NOT SELECTED, SELECTED, SELECTED, SELECTED, NOT SELECTED
Answers 29-30: NOT SELECTED, SELECTED

Information for Questions 56-60
Figure 2. Confusion matrix (Sensitivity 96.7%, Specificity 84.5%)
A support vector machine model has been created to predict whether a person is right-handed or left-handed, based on the person's genetic profile. The figure above shows a confusion matrix of the model's performance on a test data set that it was not trained on.

Instructions for Questions 56-59
For each of the following four questions, select the calculation that is most appropriate to support or refute the statement. Each calculation might be used zero, one, or more than one time in the four questions.

Question 56 1 / 1 pts
Which calculation is most appropriate to support or refute the statement "if someone is right-handed, then the model is very likely to predict the person to be right-handed"?
948/(948+32)=96.7%
948/(948+991)=48.9%
5412/(5412+991)=84.5%
5412/(5412+32)=99.4%

Question 57 1 / 1 pts
Which calculation is most appropriate to support or refute the statement "if someone is left-handed, then the model is very likely to predict the person to be left-handed"?
948/(948+32)=96.7%
948/(948+991)=48.9%
5412/(5412+991)=84.5%
5412/(5412+32)=99.4%

Question 58 1 / 1 pts
Which calculation is most appropriate to support or refute the statement "if the model predicts someone to be left-handed, then the person is very likely to be left-handed"?
948/(948+32)=96.7%
948/(948+991)=48.9%
5412/(5412+991)=84.5%
5412/(5412+32)=99.4%

Question 59 1 / 1 pts
Which calculation is most appropriate to support or refute the statement "if the model predicts someone to be right-handed, then the person is very likely to be right-handed"?
948/(948+32)=96.7%
948/(948+991)=48.9%
5412/(5412+991)=84.5%
5412/(5412+32)=99.4%

Question 60 2 / 2 pts
In both sentences about using the model, select the option that makes each sentence most reasonable.
i. When the model predicts left-handedness, it is more reasonable to remain undecided.
ii. When the model predicts right-handedness, it is more reasonable to use the model's classification.
Answer 1: remain undecided
Answer 2: use the model's classification

IMPORTANT INSTRUCTIONS FOR QUESTION 61
This question has five parts (A,B,C,D,E). Each part has several choices. For each choice, you must answer whether you have "SELECTED" or "NOT SELECTED" that choice. If you leave a choice unanswered, it will be counted as wrong.

Question 61 11.4 / 12 pts
A large city is trying to plan its road construction so that its roads are appropriately-sized for when cars are all self-driving (autonomous). This description is simplified from its real complexity; if you're an expert in traffic, road-sizing, autonomous vehicles, etc., please do not rely on your expertise to fill in all the extra complexity (you'll end up making the questions below more difficult than I intended).
Research has suggested that adding autonomous vehicles to the current system could significantly reduce traffic by reducing the variance in speed of both the autonomous and the human-driven vehicles.
The city would like to use analytics to help determine how many lanes its highways and major streets will need to have in the future.

A. The city's director of transportation has come up with the following incorrect idea:

GIVEN an estimate of the number of cars that will be in the city 10 years from now, USE logistic regression TO determine the fraction of them that will be autonomous. Then, GIVEN the number of human-driven and autonomous cars, USE simulation TO determine how many lanes will be required on each major street and highway 10 years from now.

Select all of the statements below that show a reason why the director's idea is wrong.

i. Logistic regression is not an appropriate model to determine a fraction or a probability. [ Select ]
ii. No data has been specified on which to train the logistic regression model. [ Select ]
iii. The simulation model needs information about where cars are going from and to, not just the total number of cars. [ Select ]
iv. A simulation model is unable to take complex interactions (like those between drivers) into account. [ Select ]

B. The director has come up with another incorrect idea:

GIVEN the fraction of autonomous cars in the city over the past five years, USE exponential smoothing TO estimate the fraction of autonomous cars in the city in the next 10 years. Then, GIVEN that estimate and an expert estimate of how many of those cars will be going from each part of the city to each other part during rush hour, USE linear regression TO predict the number of lanes required on each major street and highway 10 years from now. Finally, GIVEN the number of cars going between parts of the city during rush hour and the number of lanes on each major street and highway, USE optimization TO find the quickest route for each car to use.

Select all of the statements below that show a reason why the director's idea is wrong.

i. Exponential smoothing is not an appropriate model to predict 10 data points in the future based on just 5 data points. [ Select ]
ii. Given the small fraction of autonomous vehicles in the city in the past 5 years, exponential smoothing cannot capture the possibility of a major increase in the next 10 years. SELECTED
iii. Linear regression cannot predict the number of lanes required based only on the number of cars going between parts of the city. SELECTED
iv. Optimization cannot find the quickest route for a car to use. NOT SELECTED

C. Select all of the possible paths below that could reasonably lead to a good solution.

i. Estimate future population and job locations in various parts of the city. Then, based on those estimates, estimate the number of cars going from each part of the city to each other part during rush hour. Estimate the fraction of cars that will be autonomous. Finally, determine how many lanes are required on each major street and highway to keep average commute time below 20 minutes. SELECTED
ii. Obtain expert estimates of the number of human-driven and autonomous cars going from each part of the city to each other part during rush hour 10 years from now. Estimate the resulting traffic on each road. Determine how many lanes are required on each road to reduce that traffic so the average commute time is below 20 minutes. SELECTED
iii. Estimate the fraction of cars that will be autonomous 10 years from now, and the number of cars that will be going from each part of the city to each other part during rush hour 10 years from now. Determine the maximum number of lanes that can be built based on budget estimates. Estimate the average commute time based on that number of lanes. If the average is above 20 minutes, plan to charge tolls for using certain congested roads during rush hour, and determine how to set those toll prices so that the average commute is below 20 minutes. SELECTED

D. Select a set of models from the list below that the director of transportation can put together to determine how large to make each street and highway.

i. GIVEN past data on neighborhood growth and general citywide population forecasts, USE optimization TO estimate the population 10, 15, and 20 years from now in each area of the city. NOT SELECTED
ii. GIVEN past data on neighborhood growth and general citywide population forecasts, USE linear regression (with an autoregressive component) TO estimate the population 10, 15, and 20 years from now in each area of the city. SELECTED
iii. GIVEN future population estimates, past data on job locations around the city, and a variety of employment forecasts and expert forecasts about jobs that will be done from a worker's home in the future, USE simulation TO develop a range of scenarios for how many cars will be going from each part of the city to each other part during rush hour. SELECTED
iv. GIVEN future population estimates, past data on job locations around the city, and a variety of employment forecasts and expert forecasts about jobs that will be done from a worker's home in the future, USE a random logistic regression forest TO develop a range of scenarios for how many cars will be going from each part of the city to each other part during rush hour. NOT SELECTED
v. GIVEN a range of estimates of the number of cars going from each part of the city to each other part during rush hour, road size restrictions, an expert estimate of the fraction of autonomous cars, and a function that estimates traffic speed based on number of human-driven and autonomous cars and number of lanes, USE linear regression TO determine how many lanes each major street and highway should have to keep the average commute time below 20 minutes in 90% of the scenarios. NOT SELECTED
vi. GIVEN a range of estimates of the number of cars going from each part of the city to each other part during rush hour, road size restrictions, an expert estimate of the fraction of autonomous cars, and a function that estimates traffic speed based on number of human-driven and autonomous cars and number of lanes, USE optimization TO determine how many lanes each major street and highway should have to keep the average commute time below 20 minutes in 90% of the scenarios. [ Select ]

It has been predicted that if all cars are autonomous, they'll get along much better on the road (unlike humans, who sometimes cut in front of each other to get ahead, causing slowdowns behind them) and commute times will be reduced. Personally, I don't entirely agree; rather than having a whole set of nice, friendly autonomous cars, I can easily envision competing software companies that try to take advantage of the "nice, friendly" attributes of other companies' software. ["Buy our more-aggressive vehicle-driving software; it cuts in front of competitor-driven cars 86% of the time, thereby reducing your commute times by over 25%!"]

E. Select all of the following models/approaches that could be used to analyze some aspect of this possibility.

i. Game-theoretic analysis, to model competitor behavior in the marketplace [ Select ]
ii. Simulation, to determine the effect of adding more-aggressive cars to the system [ Select ]
iii. McNemar's test on simulated drive times, after simulating each drive twice -- once with the "nice" software and once with the aggressive software [ Select ]

Answer 1: NOT SELECTED
Answer 2: SELECTED
Answer 3: SELECTED
Answer 4: NOT SELECTED
Answer 5: SELECTED
Answer 6: SELECTED
Answer 7: SELECTED
Answer 8: NOT SELECTED
Answer 9: SELECTED
Answer 10: SELECTED
Answer 11: SELECTED
Answer 12: NOT SELECTED
Answer 13: SELECTED
Answer 14: SELECTED
Answer 15: NOT SELECTED
Answer 16: NOT SELECTED
Answer 17: SELECTED
Answer 18: You Answered NOT SELECTED; Correct Answer SELECTED
Answer 19: SELECTED
Answer 20: SELECTED

Question 62
5.65 / 6 pts

In the United States in 2015, the overall population of 19-24-year-olds (about 27 million people) was approximately 49% women and 51% men. In the US college population of 19-24-year-olds (about 12 million people), 57% of college students were women and 43% were men.

A. To test whether this discrepancy is significant, an analyst wants to use a binomial distribution. Which one of these choices would be an appropriate test? Select "YES" for the one that is appropriate and "NO" for each of the rest.

i. Find the probability of 49% or more "yes" answers from a binomial distribution with n=27,000,000 and p=0.57. NO
ii. Find the probability of 57% or more "yes" answers from a binomial distribution with n=27,000,000 and p=0.57. NO
iii. Find the probability of 49% or more "yes" answers from a binomial distribution with n=12,000,000 and p=0.49. NO
iv. Find the probability of 57% or more "yes" answers from a binomial distribution with n=12,000,000 and p=0.49. YES

B. Select all of the approaches below that might help determine whether there has been a significant change in the fraction of college students who are men and who are women over the past 50 years. Select "YES" for each that you choose, and "NO" for each that you don't choose.

i. Classification with each year as a data point, using fraction of college students who are women as the response and the year as the predictor NO
ii. CUSUM on the fraction of college students who are men, with each year as a data point YES
iii. Exponential smoothing on the fraction of college students who are women, with each year as a data point YES
iv. Logistic regression with each year as a data point, using fraction of college students who are men as the response and the year as the predictor NO

One suggested explanation for the discrepancy is that there is a difference between girls' and boys' high school grades, partly due to boys' higher frequency of misbehavior.

C. Which one nonparametric test could be used to check whether girls' and boys' median high school grades are significantly different? Select "YES" for the test you choose and "NO" for each of the rest.

i. Paired-sample signed rank test NO
ii. McNemar's test NO
iii. Two-sample unpaired rank test (Mann-Whitney) YES
iv. One-sample signed rank test NO

A logistic regression model shows that high school GPA is a significant predictor of whether a person will go to college.

D. Select all of the statements below that could be a causal relationship between high school GPA and college attendance. Select "YES" for each that you choose, and "NO" for each that you don't choose. Base your answer only on the information above and the timing involved. [For the purpose of this question, do not judge whether the statements are true; instead, determine whether, if it were true, the statement would show a causal relationship explaining how GPA causes differences in the rates of women and men going to college.]

i. Many colleges are less likely to admit students with a lower high school GPA. YES
ii. Most community colleges will admit any high school graduate. NO
iii. The same factors that cause boys to have lower high school GPAs might also make them less likely to want to attend college. NO
iv. High school students who get higher GPAs do so because they are more serious about school, and therefore are more likely to want to attend college. NO
v. Colleges believe that a higher high school GPA is a sign that a student is taking school seriously, and colleges prefer to admit serious students. YES

Data for this question was taken from

This discrepancy is getting more and more attention in education (and in education analytics); if you have any thoughts about it, please let me know!

Answer 1: NO
Answer 2: NO
Answer 3: NO
Answer 4: YES
Answer 5: NO
Answer 6: YES
Answer 7: YES
Answer 8: You Answered NO; Correct Answer YES
Answer 9: NO
Answer 10: NO
Answer 11: YES
Answer 12: NO
Answer 13: YES
Answer 14: NO
Answer 15: NO
Answer 16: NO
Answer 17: YES

Question 63
0 / 0 pts

Do you think that you or any of your fellow students in this course would be good TAs for the course in the future? If so, please enter name(s) or username(s) below.

Your Answer: N/A

Thanks for a great semester -- I really enjoyed teaching you all, and I hope you got a lot out of it too!

Quiz Score: 96.89 out of 100.04
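As a rough illustration of the test described in Question 62, part A (the probability of seeing 57% or more "yes" answers when n=12,000,000 and p=0.49), the sketch below uses a normal approximation to the binomial distribution rather than an exact binomial calculation. The function name and the choice of approximation are assumptions made for this example; they do not come from the quiz.

```python
import math


def binomial_upper_tail_normal_approx(n, p, threshold_fraction):
    """Approximate P(X / n >= threshold_fraction) for X ~ Binomial(n, p).

    Uses the normal approximation (an assumption of this sketch, reasonable
    for very large n); returns the z-score and the upper-tail probability.
    """
    # Standard error of the sample fraction under the null value p.
    std_err = math.sqrt(p * (1.0 - p) / n)
    # How many standard errors the observed fraction lies above p.
    z = (threshold_fraction - p) / std_err
    # Upper-tail probability of a standard normal at z.
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, tail


# Numbers from Question 62: 12 million college students, 49% women in the
# overall population, 57% women observed among college students.
z_score, tail_prob = binomial_upper_tail_normal_approx(12_000_000, 0.49, 0.57)
print(f"z-score: {z_score:.1f}")
print(f"approximate upper-tail probability: {tail_prob}")
```

With a sample this large the observed fraction sits hundreds of standard errors above 0.49, so the tail probability is numerically indistinguishable from zero, which is consistent with treating the discrepancy as significant.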

Preview of the content

11/10/2025 Verified Learners | Final Quiz | ISYE6501x Courseware | edX

Quiz Score: 96.89 out of 100.04

ISYE 6501 Final Quiz 2025

Instructions for Questions 1-8

For each of the following eight questions, select the type
of problem that the model is best suited for. Each type of
problem might be used zero, one, or more than one time
in the eight questions.

0.5 / 0.5 pts
Question 1

Select the type of problem that linear regression is best suited for.

Classification

Clustering

Experimental design

Correct! Prediction from feature data

Prediction from time-series data

Variable selection

0.5 / 0.5 pts
Question 2

Select the type of problem that fractional factorial design is best
suited for.

Classification

Clustering

Correct! Experimental design

Prediction from time-series data

Variable selection and/or prediction from feature data

0.5 / 0.5 pts
Question 3

Select the type of problem that lasso regression is best suited for.

Classification

Clustering

Experimental design

Prediction from time-series data

Correct! Variable selection and/or prediction from feature data

0.5 / 0.5 pts
Question 4

Select the type of problem that logistic regression is best suited
for.

Correct! Classification and/or prediction from feature data

Clustering

Experimental design

Prediction from time-series data

Variable selection

0.5 / 0.5 pts
Question 5

Select the type of problem that k-means is best suited for.

Classification

Correct! Clustering

Experimental design

Prediction from feature data

Prediction from time-series data


Written for

Institution: Georgia Institute Of Technology
Course: ISYE 6501/ISYE6501 (ISYE6501)

Document information

Uploaded on: 8 December 2025
Number of pages: 68
Written in: 2025/2026
Type: Exam (worked answers)
Contains: Questions and answers


Seller: MindCraft
