This project implements and compares four gradient descent optimization techniques (Batch Gradient Descent, Stochastic Gradient Descent, Mini-Batch Gradient Descent, and Momentum Gradient Descent) to predict student performance from academic and lifestyle features such as study hours, previous scores, and sleep duration.
The goal is to understand how the different gradient descent methods perform when optimizing a linear regression model and how they converge over time. The model is trained to predict the Performance Index of students using the following features:
- Hours Studied
- Previous Scores
- Sleep Hours
- Sample Question Papers Practiced
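A minimal sketch of how the data might be loaded and prepared is shown below. The CSV filename is an assumption, and the train/test split, feature scaling, and bias column reflect a typical setup rather than the project's exact preprocessing:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical filename; adjust to wherever the dataset is stored.
df = pd.read_csv("Student_Performance.csv")

features = ["Hours Studied", "Previous Scores", "Sleep Hours",
            "Sample Question Papers Practiced"]
X = df[features].to_numpy(dtype=float)
y = df["Performance Index"].to_numpy(dtype=float)

# Hold out a test set, then standardize features so every gradient
# descent variant works on the same scale.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Add a bias column of ones so the intercept is learned as an extra weight.
X_train = np.c_[np.ones(len(X_train)), X_train]
X_test = np.c_[np.ones(len(X_test)), X_test]
```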
The dataset used is Student Performance, which contains 10,000 rows and the following columns:
- Hours Studied
- Previous Scores
- Extracurricular Activities
- Sleep Hours
- Sample Question Papers Practiced
- Performance Index (target)
The following four gradient descent methods are implemented:
- Batch Gradient Descent (BGD)
- Stochastic Gradient Descent (SGD)
- Mini-Batch Gradient Descent (MBGD)
- Momentum Gradient Descent (MGD)
Each method is implemented from scratch using NumPy, and performance is evaluated based on cost function convergence and RMSE (Root Mean Squared Error) on the test set.
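A condensed sketch of the four update rules is given below; hyperparameters such as the learning rate, batch size, and momentum coefficient are illustrative values rather than the exact settings used in the project:

```python
import numpy as np

def predict(X, w):
    return X @ w

def mse_cost(X, y, w):
    err = predict(X, w) - y
    return np.mean(err ** 2) / 2

def gradient(X, y, w):
    # Gradient of the (half) mean squared error for linear regression.
    return X.T @ (predict(X, w) - y) / len(y)

def batch_gd(X, y, lr=0.01, epochs=1000):
    w, history = np.zeros(X.shape[1]), []
    for _ in range(epochs):
        w -= lr * gradient(X, y, w)          # full-dataset gradient each step
        history.append(mse_cost(X, y, w))
    return w, history

def stochastic_gd(X, y, lr=0.01, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    w, history = np.zeros(X.shape[1]), []
    for _ in range(epochs):
        for i in rng.permutation(len(y)):    # update on one sample at a time
            w -= lr * gradient(X[i:i+1], y[i:i+1], w)
        history.append(mse_cost(X, y, w))
    return w, history

def mini_batch_gd(X, y, lr=0.01, epochs=200, batch_size=32, seed=0):
    rng = np.random.default_rng(seed)
    w, history = np.zeros(X.shape[1]), []
    for _ in range(epochs):
        idx = rng.permutation(len(y))
        for start in range(0, len(y), batch_size):
            b = idx[start:start + batch_size]
            w -= lr * gradient(X[b], y[b], w)
        history.append(mse_cost(X, y, w))
    return w, history

def momentum_gd(X, y, lr=0.01, epochs=1000, beta=0.9):
    w, history = np.zeros(X.shape[1]), []
    v = np.zeros_like(w)                     # velocity accumulates past gradients
    for _ in range(epochs):
        v = beta * v + lr * gradient(X, y, w)
        w -= v
        history.append(mse_cost(X, y, w))
    return w, history
```

Each function returns the learned weight vector together with a per-epoch cost history, which is what the convergence plots below rely on.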
The project includes visualizations of:
- Feature distributions and pairwise relationships
- Correlation heatmap
- Cost convergence for each gradient descent method
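The cost-convergence comparison can be produced with a short Matplotlib helper along these lines; the dictionary keys and history lists are assumed to come from the training sketches above:

```python
import matplotlib.pyplot as plt

def plot_convergence(histories):
    """Plot cost-per-epoch curves, e.g. {"Batch GD": hist_bgd, ...}."""
    plt.figure(figsize=(8, 5))
    for name, hist in histories.items():
        plt.plot(hist, label=name)
    plt.xlabel("Epoch")
    plt.ylabel("Cost")
    plt.title("Cost convergence by gradient descent method")
    plt.legend()
    plt.show()
```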
Final RMSE on the test set for each method:
- Batch GD: 2.0735
- Stochastic GD: 2.0777
- Mini-Batch GD: 2.0471
- Momentum GD: 2.0451
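The RMSE figures above correspond to an evaluation step of roughly this form (a sketch; the weight-vector names are placeholders for whatever each training run returns):

```python
import numpy as np

def rmse(X, y, w):
    # Root mean squared error of the linear model X @ w against targets y.
    return np.sqrt(np.mean((X @ w - y) ** 2))

# Example usage, assuming X_test/y_test from the data-prep sketch and a
# trained weight vector such as w_bgd:
# print(f"Batch GD: {rmse(X_test, y_test, w_bgd):.4f}")
```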
Libraries used:
- NumPy
- Pandas
- Matplotlib
- Seaborn
- Scikit-learn
Key takeaways:
- Gradient descent is a fundamental concept in optimization and machine learning.
- Different methods have trade-offs in speed, accuracy, and convergence stability.
- Momentum and Mini-Batch GD showed slightly better performance in this context.