Aim: Perform descriptive statistics on a given data set.
This program analyzes student performance across four semesters using
Python and Pandas. It provides both statistical insights and visualizations.
The dataset includes students' names, genders, enrollment numbers,
semester marks, mobile numbers, and cities.
Key Features:
1. Overall Class Performance → Boxplot distribution of semester marks.
2. Gender-based Performance → Comparison of average marks between
male and female students.
3. Semester-wise Progress → Line graph of class average performance
over semesters.
4. Subject-wise Strengths & Weaknesses → Bar chart showing each
student’s marks per semester.
5. Correlation Analysis → Heatmap showing correlation between semester
performances.
Outputs:
- Descriptive statistics in the console.
- Five plots, each saved as a JPEG file.
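The descriptive statistics printed to the console can be cross-checked by hand. The sketch below recomputes the main summary values for the Semester 1 marks using only the Python standard library. The helper name summarize is hypothetical and is not part of the program; the marks are copied from the dataset in the Code section.

```python
import statistics

# Semester 1 marks copied from the dataset in the Code section
sem1_marks = [78, 85, 62, 90, 74, 88]


def summarize(marks):
    """Hypothetical helper: basic descriptive statistics for a list of marks."""
    return {
        "count": len(marks),
        "mean": statistics.mean(marks),
        # Sample standard deviation (divides by n - 1)
        "std": statistics.stdev(marks),
        "min": min(marks),
        "median": statistics.median(marks),
        "max": max(marks),
    }


summary = summarize(sem1_marks)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

The sample standard deviation is used here because pandas also divides by n - 1 by default, so the values should line up with the Semester1_Marks column of df.describe().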
Code:
import pandas as pd
import matplotlib.pyplot as plt
# Dataset with given names
data = {
'Name': ['Dhruvil', 'Ridhhi', 'Divyesh', 'Yash', 'Om', 'Purva'],
'Gender': ['M', 'F', 'M', 'M', 'M', 'F'],
'EnrollmentNo': [101, 102, 103, 104, 105, 106],
'Semester1_Marks': [78, 85, 62, 90, 74, 88],
'Semester2_Marks': [82, 79, 70, 88, 69, 91],
'Semester3_Marks': [74, 92, 68, 84, 71, 89],
'Semester4_Marks': [80, 87, 72, 91, 76, 90],
'MobileNo': ['9991112222', '8882223333', '7773334444', '6664445555',
'5556667777', '4447778888'],
'City': ['Delhi', 'Mumbai', 'Chennai', 'Delhi', 'Kolkata', 'Pune']
}
# Create DataFrame
df = pd.DataFrame(data)
# Overall descriptive statistics
print("Overall Class Performance:")
print(df.describe())
# Gender-based average performance
gender_avg = df.groupby('Gender')[
    ['Semester1_Marks', 'Semester2_Marks', 'Semester3_Marks', 'Semester4_Marks']
].mean()
print("\nGender-based Performance:")
print(gender_avg)
# Semester-wise progress (class average)
semester_avg = df[
    ['Semester1_Marks', 'Semester2_Marks', 'Semester3_Marks', 'Semester4_Marks']
].mean()
print("\nSemester-wise Progress (Class Average):")
print(semester_avg)
# Correlation between semesters
correlation = df[
    ['Semester1_Marks', 'Semester2_Marks', 'Semester3_Marks', 'Semester4_Marks']
].corr()
print("\nCorrelation between Semesters:")
print(correlation)
# Overall Class Performance (Boxplot)
plt.figure()
df[
    ['Semester1_Marks', 'Semester2_Marks', 'Semester3_Marks', 'Semester4_Marks']
].boxplot()
plt.title("Overall Class Performance (Marks Distribution)")
plt.ylabel("Marks")
plt.savefig("overall_class_performance.jpeg", format='jpeg')
plt.show()
# Gender-based Performance
plt.figure()
# Pass the current axes so pandas draws on this figure instead of opening a new one
gender_avg.T.plot(kind='bar', ax=plt.gca())
plt.title("Gender-based Performance Comparison")
plt.xlabel("Semesters")
plt.ylabel("Average Marks")
plt.savefig("gender_based_performance.jpeg", format='jpeg')
plt.show()
# Semester-wise Progress Trend
plt.figure()
semester_avg.plot(marker='o')
plt.title("Semester-wise Progress (Class Average)")
plt.xlabel("Semester")
plt.ylabel("Average Marks")
plt.savefig("semester_wise_progress.jpeg", format='jpeg')
plt.show()
# Subject-wise Strengths/Weaknesses
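plt.figure()
# Pass the current axes so pandas draws on this figure instead of opening a new one
df.set_index('Name')[
    ['Semester1_Marks', 'Semester2_Marks', 'Semester3_Marks', 'Semester4_Marks']
].plot(kind='bar', ax=plt.gca())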
plt.title("Subject-wise Strengths and Weaknesses")
plt.xlabel("Students")
plt.ylabel("Marks")
plt.savefig("subject_strengths_weaknesses.jpeg", format='jpeg')
plt.show()
# Correlation Heatmap
plt.figure()
plt.imshow(correlation, cmap='coolwarm', interpolation='none')
plt.colorbar(label='Correlation Coefficient')
plt.xticks(range(len(correlation)), correlation.columns, rotation=45)
plt.yticks(range(len(correlation)), correlation.columns)
plt.title("Correlation between Academic Periods")
plt.savefig("correlation_heatmap.jpeg", format='jpeg')
plt.show()
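Each cell of the correlation heatmap is a Pearson correlation coefficient between two semesters. As a rough illustration of what that number means, the sketch below computes the coefficient between Semester 1 and Semester 2 by hand. The function name pearson is hypothetical and is not part of the program; the marks are copied from the dataset above.

```python
import math

# Marks copied from the dataset above
sem1 = [78, 85, 62, 90, 74, 88]
sem2 = [82, 79, 70, 88, 69, 91]


def pearson(x, y):
    """Hypothetical helper: Pearson correlation of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (covariance numerator)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


r_12 = pearson(sem1, sem2)
print(f"Correlation between Semester 1 and Semester 2: {r_12:.3f}")
```

A value close to 1 means students who scored high in one semester tended to score high in the other. With only six students, the coefficients should be read as indicative rather than conclusive.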
Output:
1. Console: