Publication Details
Ouroboros: Early identification of at-risk students without models based on legacy data
Student Retention, Predictive Analytics, Self-Learning, Imbalanced data, Learning Analytics
This paper addresses the problem of identifying students who are at risk of failing their courses when no data from previous runs of the course are available; such legacy data are normally used to train the machine learning models. The problem typically arises in newly opened courses. To tackle it, we present "Ouroboros", an approach based on the concept of "Self-Learning": it builds the machine learning models from data in the currently running course. Moreover, most students who fail a course withdraw in its first weeks, so the focus is on identifying at-risk students as early as possible. The approach utilises information about already submitted assessments. This raises another problem that needs to be treated: imbalanced data for training and testing the classification models. The paper makes three main contributions: 1) the concept of training models for identifying at-risk students using data from the same running course, 2) the specification of the problem as a classification task, and 3) a treatment of the imbalanced data that appear in both the training and the testing sets. The results demonstrate the validity of the concept and show that it stands comparison with traditional approaches that learn the models from legacy course data.
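The following is a minimal sketch of one way to read the ideas in the abstract, not the authors' implementation. It assumes that labels are derived from which students in the running course have already submitted an assessment by a given day, and it uses inverse-frequency class weighting as a stand-in for whatever imbalance treatment the paper actually applies. The function names, the `submission_day` mapping, and the student data are all hypothetical.

```python
from collections import Counter


def label_snapshot(submission_day, current_day):
    """Label each student 1 if they had submitted by current_day, else 0.

    submission_day maps a student id to the day of submission,
    or to None if the student has not submitted (hypothetical format).
    """
    return {
        student: int(day is not None and day <= current_day)
        for student, day in submission_day.items()
    }


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    This is a generic imbalance remedy assumed for illustration only.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical running course: few students have submitted early.
submission_day = {
    "s1": 3, "s2": None, "s3": None, "s4": 9,
    "s5": None, "s6": None, "s7": None, "s8": None,
}

# Training labels come from the current course only, as of day 5.
labels = label_snapshot(submission_day, current_day=5)

# Early in the course the positive class is rare, so it gets a larger weight.
weights = balanced_class_weights(list(labels.values()))
print(labels)
print(weights)
```

In this toy snapshot only one of eight students has submitted by day 5, so the rare class receives a much larger weight than the common one; a classifier trained on such labels could take these weights as per-class sample weights.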
@inproceedings{BUT134240,
author="Martin {Hlosta} and Zdeněk {Zdráhal} and Jaroslav {Zendulka}",
title="Ouroboros: Early identification of at-risk students without models based on legacy data",
booktitle="LAK '17 Proceedings of the Seventh International Learning Analytics \& Knowledge Conference",
year="2017",
pages="6--15",
publisher="Association for Computing Machinery",
address="Vancouver",
doi="10.1145/3027385.3027449",
isbn="978-1-4503-4870-6",
url="http://dl.acm.org/citation.cfm?id=3027449"
}