Logistic Regression Machine Learning Technique for academic predictions

Keywords: Simple Logistic Regression, Multiple Logistic Regression, Mean Squared Error (MSE), Coefficient of Determination (R2), Gradient Descent, Student


In this article, we use logistic regression as a machine learning technique to predict whether a first-semester student at the Francisco José de Caldas District University will pass or fail the calculus course, and with what probability.

To do this, we use the student database available in the data repository of the FJC District University; see [1].

Initially, we apply logistic regression to the student's entry age and the grade obtained in previous semesters to predict the probability, expressed as a percentage, that a new student will obtain a passing grade, which by the standards of the District University is three point zero (3.0).

Second, we use multiple logistic regression, which takes more than two input variables together with the grade obtained by the students in the previous semester. With this we predict whether the student obtains a passing grade, along with the corresponding percentage.

With simple logistic regression, we predict the probability that a student with a given score on the state tests (ICFES) will fail a first-semester course at the FJC District University of Bogotá.
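
The simple logistic regression described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the article's implementation: the article uses Keras and TensorFlow, while this sketch uses plain Python with batch gradient descent on the log-loss. The scores and pass/fail labels are invented for the example, and the names `fit_simple_logistic` and `predict_fail` are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_simple_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(fail | x) = sigmoid(w * x_std + b) by batch gradient descent.

    The input is standardized first so that one learning rate works
    regardless of the scale of the scores.
    """
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n) or 1.0
    zs = [(x - mean) / std for x in xs]

    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w, grad_b = 0.0, 0.0
        for z, y in zip(zs, ys):
            err = sigmoid(w * z + b) - y  # derivative of the log-loss
            grad_w += err * z
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n

    def predict(x):
        """Return the estimated probability of failing for score x."""
        return sigmoid(w * (x - mean) / std + b)

    return predict


# Hypothetical data (NOT from the university database):
# ICFES-like scores and labels, 1 = failed the course, 0 = passed.
scores = [210, 230, 250, 260, 280, 300, 320, 340, 360, 380]
failed = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

predict_fail = fit_simple_logistic(scores, failed)
print("P(fail | score=240):", round(predict_fail(240), 2))
print("P(fail | score=350):", round(predict_fail(350), 2))
```

Multiplying the returned probability by 100 gives the percentage reported for each student. A multiple logistic regression follows the same pattern with one weight per input variable.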

For this we used Python with the Keras and TensorFlow libraries. To evaluate the performance of our model, we analyzed the data using the mean squared error (MSE), the root mean squared error (RMSE), which is the square root of the MSE, and the coefficient of determination (R2). The RMSE measures how well a regression line fits the data points.
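
As a minimal sketch of the three evaluation metrics named above, the following plain-Python functions compute the MSE, the RMSE, and R2 from their standard definitions. The function names and the sample values are assumptions made for illustration. They are not taken from the article's code or data.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical labels (1 = pass, 0 = fail) and predicted probabilities.
y_true = [1, 0, 1, 0]
y_pred = [0.9, 0.2, 0.6, 0.1]

print("MSE :", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R2  :", r2_score(y_true, y_pred))
```

Lower MSE and RMSE indicate predictions closer to the observed values, while an R2 closer to 1 indicates that the model explains more of the variance in the observations.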

To manage the database, we worked with .csv files, which we manipulated using the NumPy and Pandas libraries for Python.
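
The kind of .csv handling described above might look like the following sketch. It assumes a hypothetical file layout: the column names `age`, `icfes_score`, and `calculus_grade` and the sample rows are invented, and the real database may be organized differently. The only value taken from the article is the passing grade of 3.0.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """age,icfes_score,calculus_grade
17,310,3.8
18,250,2.4
19,,3.1
17,340,4.2
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Drop rows with a missing ICFES score (one possible cleaning choice).
df = df.dropna(subset=["icfes_score"])

# Binary target: 1 if the grade reaches the passing threshold of 3.0.
PASSING_GRADE = 3.0
df["passed"] = (df["calculus_grade"] >= PASSING_GRADE).astype(int)

# NumPy arrays ready to be passed to a model.
X = df[["age", "icfes_score"]].to_numpy(dtype=float)
y = df["passed"].to_numpy()

print(X.shape, y.tolist())
```

With a real file, `io.StringIO(csv_text)` would be replaced by the path to the .csv file.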





