
ISYE 7406 Homework 5 Review 2 | Verified Study set complete Solutions | A+ Graded | 2026 Updates | 100% correct

Pages: 24 | Uploaded: 24-04-2026 | Written in: 2025/2026


ISYE7406 HW5


Introduction
Student placement prediction is a common application of statistical learning
methods, as it involves identifying factors that influence whether a student secures
employment. Understanding these factors can provide useful insights for both
students preparing for job placement and organizations aiming to improve
candidate selection.

In this study, I analyze a Student Placement Dataset consisting of 10,000
observations and multiple predictor variables describing students’ academic
performance, skills, and background characteristics. The response variable is
placement status, a binary outcome indicating whether a student is placed or not.

The primary objective of this report is to apply and compare a range of statistical
learning methods for predicting placement outcomes. These include baseline
approaches such as K-Nearest Neighbors (KNN) and Naïve Bayes, as well as more
advanced ensemble methods, including Random Forest and Gradient Boosting
Machine (GBM).

All models are trained on a training set and evaluated on a held-out test set. Model
performance is compared in terms of test error and classification accuracy. In
addition, hyperparameters for each method are selected using cross-validation on
the training set only, so that the test set plays no role in tuning and the comparison
across methods remains fair.
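The tuning protocol described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used for the report: the data are synthetic, the two features are made up, the 70/30 split, the 5 folds, and the candidate values of k are all arbitrary choices, and KNN is written by hand so that the example needs nothing beyond the Python standard library.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def error_rate(train_X, train_y, eval_X, eval_y, k):
    """Fraction of evaluation points that KNN misclassifies."""
    wrong = sum(knn_predict(train_X, train_y, x, k) != y
                for x, y in zip(eval_X, eval_y))
    return wrong / len(eval_y)

def cv_error(X, y, k, n_folds=5):
    """Average validation error over n_folds folds of the training data."""
    idx = list(range(len(X)))
    folds = [idx[f::n_folds] for f in range(n_folds)]
    errors = []
    for fold in folds:
        held = set(fold)
        fit = [i for i in idx if i not in held]
        errors.append(error_rate([X[i] for i in fit], [y[i] for i in fit],
                                 [X[i] for i in fold], [y[i] for i in fold], k))
    return sum(errors) / n_folds

# Synthetic stand-in for the placement data (assumption: two numeric features).
rng = random.Random(7406)
X, y = [], []
for _ in range(300):
    label = int(rng.random() < 0.6)           # 1 = placed, 0 = not placed
    X.append([rng.gauss(7.5 if label else 6.5, 0.8),
              rng.gauss(6.0 + label, 1.2)])
    y.append(label)

# Hold out 30% of the observations as a test set; they are never used for tuning.
n_train = 210
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

# Choose k by cross-validation on the training set only.
candidate_k = [1, 5, 15, 25]
cv_errors = {k: cv_error(X_train, y_train, k) for k in candidate_k}
best_k = min(cv_errors, key=cv_errors.get)

# Report the error of the selected model on the untouched test set.
test_error = error_rate(X_train, y_train, X_test, y_test, best_k)
print(cv_errors, best_k, test_error)
```

The same pattern applies to the other methods: only the model and its grid of hyperparameters change, while the test set is touched exactly once per method.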

Through this analysis, I aim to identify which modeling approach achieves the
best predictive performance and to understand the strengths and limitations of
the different methods when applied to this dataset.



Exploratory Data Analysis
The dataset used in this project was obtained from Kaggle:
https://www.kaggle.com/datasets/rakesh630/global-student-placement-2025-dataset/data

The dataset consists of 10,000 student records with 11 predictor variables,
capturing both institutional characteristics and individual-level attributes. These
variables include university-related information, such as country, college tier, and
university ranking band, as well as student-specific factors including academic
performance, internship experience, skill assessments, and specialization.

The dataset contains two outcome variables: placement status, a binary indicator of
whether a student was Placed or Not Placed, and salary. The salary variable was
excluded from the analysis because it is only observed for placed students and is
missing for all non-placed observations (3,848 cases). Because salary is a
post-placement outcome rather than a pre-placement characteristic, its presence or
absence alone reveals the response, so including it would leak the outcome into the
predictive models.
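A quick way to confirm this kind of leakage before dropping a column is to check whether its missingness lines up exactly with the response. The sketch below uses a handful of invented records; the column name `salary`, the `cgpa` field, and the helper functions are assumptions made for illustration, not the dataset's confirmed schema.

```python
# Toy records standing in for the real data (values are invented).
records = [
    {"cgpa": 8.1, "placement_status": "Placed", "salary": 52000.0},
    {"cgpa": 6.2, "placement_status": "Not Placed", "salary": None},
    {"cgpa": 7.4, "placement_status": "Placed", "salary": 41000.0},
    {"cgpa": 5.9, "placement_status": "Not Placed", "salary": None},
    {"cgpa": 9.0, "placement_status": "Not Placed", "salary": None},
]

def missing_iff_not_placed(rows):
    """True when salary is missing exactly for the rows labelled Not Placed."""
    return all((r["salary"] is None) == (r["placement_status"] == "Not Placed")
               for r in rows)

def drop_columns(rows, columns):
    """Return copies of the rows without the named columns."""
    return [{key: value for key, value in r.items() if key not in columns}
            for r in rows]

# If this is True, "salary is missing" is a perfect proxy for the response.
salary_leaks_outcome = missing_iff_not_placed(records)

# Remove the column before any model sees the data.
model_rows = drop_columns(records, {"salary"})
print(salary_leaks_outcome, sorted(model_rows[0]))
```

When the check returns True, any model given the column could reach a perfect score by learning the missingness pattern alone, which says nothing about placement factors.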

To facilitate analysis, the predictors were grouped into several categories.
Institutional characteristics include college tier and university ranking band, which
reflect the overall quality and reputation of the educational institution. Academic
metrics consist of CGPA and backlogs, capturing students’ academic performance.
Experience- and skill-related variables include internship count, internship quality
score, aptitude score, and communication score. Field of study and industry
alignment are represented by specialization and industry, while geographic
information is captured by country.

The dataset includes a mix of numerical variables, such as CGPA, aptitude score, and
communication score, and categorical variables, including specialization, industry,
and college tier, providing a diverse set of features for modeling.
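Mixed feature types matter for distance-based methods such as KNN, which need numeric inputs on comparable scales. The report preview does not state how the predictors were preprocessed, so the following is only one common option, shown on invented values: numerical columns are standardized and categorical columns are one-hot encoded. The level names "Tier 1" and "Tier 2" are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale a numeric column to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Encode a categorical column as 0/1 indicator columns, one per level."""
    levels = sorted(set(values))
    return levels, [[int(v == level) for level in levels] for v in values]

# Invented example columns (assumption: not taken from the real dataset).
cgpa = [6.0, 7.0, 8.0]
college_tier = ["Tier 2", "Tier 1", "Tier 2"]

cgpa_z = standardize(cgpa)
tier_levels, tier_dummies = one_hot(college_tier)

# Combine into one numeric feature row per student.
feature_rows = [[z] + dummies for z, dummies in zip(cgpa_z, tier_dummies)]
print(tier_levels, feature_rows)
```

Tree-based methods such as Random Forest and GBM are insensitive to the scaling step, so it mainly affects KNN.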




Figure 1. Summary Statistics of the Dataset

From the summary statistics, CGPA ranges from 4 to 10, with a mean of
approximately 7, and appears to follow an approximately normal distribution. In
contrast, both backlogs and internship count exhibit right-skewed distributions. This
is consistent with expectations, as most students have relatively few backlogs
(median = 1), indicating that repeated academic failure is uncommon. Similarly,
internship counts are generally low, suggesting that many students have limited
practical experience.

Figure 2. Histogram of Placement Status

The target variable, placement_status, exhibits a moderate class imbalance:
approximately 61.5% of students are placed, while 38.5% are not placed.
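These proportions also fix a reference point for the model comparison: a classifier that always predicts the majority class is already right for every placed student. The arithmetic below assumes that the 3,848 records with missing salary are exactly the non-placed students, as stated earlier, and that the test set has roughly the same class mix as the full dataset.

```python
n_total = 10_000
n_not_placed = 3_848                    # records with missing salary
n_placed = n_total - n_not_placed

share_placed = n_placed / n_total

# Always predicting the majority class is correct for that class only.
baseline_accuracy = max(share_placed, 1 - share_placed)
baseline_error = 1 - baseline_accuracy

print(f"{share_placed:.1%} placed, baseline error {baseline_error:.1%}")
# prints: 61.5% placed, baseline error 38.5%
```

Under these assumptions, a test error near 38.5% would mean a model has learned little beyond the class frequencies.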




Figure 3. Boxplot and Density Plot of CGPA by Placement Status

The density plot indicates that placed students generally have higher CGPA values,
with a peak around 7.5, whereas non-placed students exhibit a peak closer to 6.5.
The boxplot further supports this pattern, showing that the median CGPA for placed
students is higher than that of non-placed students.

Both groups contain some outliers. For non-placed students, CGPA ranges from
approximately 4.0 to above 9.0, while for placed students the range extends from
about 4.7 to 10.0. These outliers are not removed, as a high CGPA does not
guarantee job placement, and a low CGPA does not rule it out. In real-world hiring
processes, multiple factors are considered beyond academic performance alone.
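For reference, the points a boxplot draws as outliers are conventionally those lying more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule per group to invented CGPA values chosen to mimic the pattern described above; whether Figure 3 used exactly this whisker setting is an assumption.

```python
from statistics import median, quantiles

def boxplot_outliers(values, whisker=1.5):
    """Values lying more than whisker * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]

# Invented CGPA values per group (not the real data).
cgpa_by_status = {
    "Placed": [7.1, 7.4, 7.5, 7.6, 7.8, 8.0, 7.3, 7.7, 4.7, 10.0],
    "Not Placed": [6.1, 6.4, 6.5, 6.6, 6.8, 6.3, 6.7, 6.5, 4.0, 9.2],
}

medians = {status: median(v) for status, v in cgpa_by_status.items()}
outliers = {status: sorted(boxplot_outliers(v))
            for status, v in cgpa_by_status.items()}

print(medians)
print(outliers)
```

Flagged points are kept in the modeling data here, in line with the reasoning above; the rule only identifies them for inspection.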
