A data-driven approach to reducing dropout rates at BHT Berlin
Background
With the growing number of students in higher education, student dropout and student failure affect an increasing number of individuals and institutions worldwide. The consequences are diverse: student dissatisfaction and an impact on the university’s funding model and reputation.
The primary role of universities is to educate each student in the best possible way. With growing cohorts and a growing choice of degrees and of paths within each degree, one-size-fits-all recommendations are no longer suitable. There is a need to take into account the situation of each individual and help them navigate their studies in the best possible way.
Various stakeholders are regularly involved in the project:
Program heads
Dean of Studies Department VI
Data Protection Officer
Digitization Commission
Research Group ‘Computer Science Education / Computer Science and Society’ at Humboldt University
Students
Interns and doctoral students
Goal
This project aims to exploit the data that universities have about the academic achievements of their current and past students, to devise algorithms and to build tools to
identify the different paths that students follow in their curriculum,
better understand how these paths influence their progress, and
use this knowledge to help students who are in difficulty by providing informed personalized advice.
Publications
Design and Evaluation of Personalized Course Recommendations to Reduce Dropout Risk in Higher Education
Dropout rates in higher education are particularly high during the early semesters. Up to 47% of student dropouts in Germany occur within the first two semesters, with male students, first-generation students, and students with a migration background being especially affected. Key causes include examination failure and perceived high academic demands. This dissertation presents the design and evaluation of a course recommender system aimed at supporting students at risk of dropping out. The system design is based on significant performance differences between students who dropped out and students who graduated. The early involvement of students in the development process contributed to the creation of a transparent and interpretable recommendation system that is accessible to all students. The recommendation system employs an explainable nearest-neighbor algorithm and considers two success criteria: degree completion and passing a minimum number of courses per semester. The latter enables earlier feedback on success and allows the system to adapt more quickly to curriculum changes. An evaluation using historical data and various metrics shows, among other findings, that the number of recommended courses adapts to students’ performance levels, thereby preventing overload. Students at risk of dropping out are more likely to be recommended courses that are easier to pass, increasing their chances of passing and progressing academically. A survey of 100 students confirmed the comprehensibility of both the recommendations and their explanations, with no clear preference identified between presenting the recommended courses as a list or as a set. Overall, this work highlights the importance of explainable and flexible recommendation systems in higher education.
@phdthesis{wagner_design_2026,title={Design and {Evaluation} of {Personalized} {Course} {Recommendations} to {Reduce} {Dropout} {Risk} in {Higher} {Education}},copyright={(CC BY 4.0) Attribution 4.0 International, Creative Commons Attribution 4.0 International},url={https://edoc.hu-berlin.de/handle/18452/36497},note={https://edoc.hu-berlin.de/handle/18452/36497},doi={10.18452/35846},language={en},urldate={2026-05-08},school={Humboldt-Universität zu Berlin},author={Wagner, Kerstin},month=jan,year={2026},keywords={004 Informatik, Abbruchsgefährdete Studierende, Bildungsdatenanalyse, Course Recommender System, Educational Data Mining, Erklärbare Künstliche Intelligenz, Explainable Artificial Intelligence, Higher Education, Hochschulbildung, Kursempfehlungssystem, Learning Analytics, Lernanalyse, Students at Risk of Dropping out},projects={students_advice}}
A Course Recommender System Built on Success to Support at Risk Students in Higher Education
In this paper, we present an extended evaluation of a course recommender system designed to support students who struggle in the first semesters of their studies and are at risk of dropping out.
The system, which was developed in earlier work using a student-centered design, is based on the explainable k-nearest neighbor algorithm and recommends a set of courses that have been passed by the majority of successful neighbors, that is, students who graduated from the study program. In terms of the number of recommended courses, we found a discrepancy between the number of courses that struggling students are recommended to take and the actual number of courses they take. This indicates that there may be an alternative path that these students could consider. However, the recommended courses align well with the courses taken by students who successfully graduated. This suggests that even students who are performing well could still benefit from the course recommender system designed for at-risk students.
In the present work, we investigate a second type of success - a specific minimum number of courses passed - and compare the results with our first approach from previous work.
With the second type, information about success may already be available after one semester rather than only after graduation, which allows the database to grow faster and the system to respond faster to curricular changes. The evaluation of three different study programs in terms of dropout risk reduction and recommendation quality suggests that course recommendations based on students passing at least three courses in the following semester can be an alternative to guide students on a successful path. The aggregated result data and result explorations are available at: https://kwbln.github.io/jedm23.
@article{wagner_course_2024,title={A {Course} {Recommender} {System} {Built} on {Success} to {Support} at {Risk} {Students} in {Higher} {Education}},volume={16},issn={2157-2100},url={https://doi.org/10.5281/zenodo.11384083},doi={10.5281/zenodo.11384083},number={1},journal={Journal of Educational Data Mining (JEDM)},author={Wagner, Kerstin and Merceron, Agathe and Sauer, Petra and Pinkwart, Niels},year={2024},note={https://doi.org/10.5281/zenodo.11384083},pages={330--364},projects={students_advice}}
About the Perceived Quality of a Course Recommender System
In this work, we present a survey of a course recommender conducted among students and its results.
The course recommender system, published in our previous work (Wagner et al., 2023), is based on the nearest neighbors algorithm and aims to support students in their course enrollment; it primarily targets students who did not pass all mandatory courses as indicated in the study handbook in their first or second semester at university.
The primary objective of the survey was to evaluate the perceived quality of explanations and recommendations based on two presentation variants (a ranked list of courses and a set of courses), as well as the general trust in such systems.
The survey included quantitative measures and demographic information from the students, so that different subgroups could be evaluated.
The results indicate that students tend to trust recommender systems and that they tend to understand the explanations.
No clear winner emerges between the presentation of the courses as a set and as a ranked list. The survey data explorations are available at: https://kwbln.github.io/csedu24.
@inproceedings{wagner_perceived_2024,title={About the {Perceived} {Quality} of a {Course} {Recommender} {System}},author={Wagner, Kerstin and Merceron, Agathe and Sauer, Petra and Pinkwart, Niels},year={2024},booktitle={Proceedings of the 16th {International} {Conference} on {Computer} {Supported} {Education} ({CSEDU})},publisher={SciTePress},address={Angers, France},doi={10.5220/0012634900003693},projects={students_advice}}
Can the Paths of Successful Students Help Other Students With Their Course Enrollments?
In this paper, we present an extended evaluation of a course recommendation system to primarily support students who struggle in the first semesters of their studies and are at risk of dropping out.
The course recommendation system was developed in earlier work using a student-centered design and is based on the explainable k-Nearest Neighbor algorithm.
The system recommends the set of courses that have been passed by the majority of their nearest neighbors who have completed their studies.
The present evaluation is based on the data of students from three different study programs. The first result is that the recommendations do lower the dropout risk.
A second result is that the set of recommended courses matches quite well the courses taken by students who completed the study program but differs from the courses taken by struggling students.
Thus, although the course recommender system primarily targets struggling students, students who are doing well could also use it.
A third result is that the number of recommended courses for struggling students is less than the number of courses they actually enrolled in.
We state that the recommendations given indicate a different and hopefully feasible path through the study program for students at risk of dropping out.
@inproceedings{wagner_can_2023,title={Can the {Paths} of {Successful} {Students} {Help} {Other} {Students} {With} {Their} {Course} {Enrollments}?},author={Wagner, Kerstin and Merceron, Agathe and Sauer, Petra and Pinkwart, Niels},year={2023},month=jul,booktitle={Proceedings of the 16th {International} {Conference} on {Educational} {Data} {Mining}},pages={171--182},publisher={International Educational Data Mining Society},address={Bengaluru, India},isbn={978-1-7336736-4-8},doi={10.5281/zenodo.8115719},url={https://zenodo.org/record/8115719},note={Nominated for Best Paper Award},projects={students_advice}}
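The majority-vote step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the grade-vector representation, the Euclidean distance, the value of k, the course codes, and the names `recommend_courses` and `graduates` are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def recommend_courses(student_grades, graduates, k=3):
    """Recommend courses passed by a strict majority of the k nearest graduates.

    student_grades: grades in the courses taken so far (hypothetical features).
    graduates: list of (grades, passed_next) pairs, where passed_next is the set
               of courses that graduate passed in the following semester.
    """
    # Rank graduates by Euclidean distance between grade vectors.
    neighbors = sorted(graduates, key=lambda g: dist(student_grades, g[0]))[:k]
    # Count how many neighbors passed each course.
    votes = Counter(course for _, passed in neighbors for course in passed)
    # Keep only courses passed by more than half of the neighbors.
    return sorted(c for c, n in votes.items() if n > len(neighbors) / 2)


# Toy data (assumed): German grades, 1.0 best and 5.0 fail, in three courses.
graduates = [
    ([2.0, 2.3, 3.0], {"M2", "P2", "DB"}),
    ([2.3, 2.7, 3.3], {"M2", "P2"}),
    ([1.0, 1.3, 1.0], {"M2", "P2", "DB", "SE"}),
    ([3.7, 4.0, 5.0], {"P2", "M1"}),
    ([3.3, 4.0, 4.0], {"P2", "M1"}),
]

print(recommend_courses([3.7, 3.7, 5.0], graduates, k=3))  # ['M1', 'P2']
print(recommend_courses([2.0, 2.7, 3.0], graduates, k=3))  # ['M2', 'P2']
```

Because each recommendation traces back to a small set of named neighbors and a vote count, the reason for recommending a course can be shown to the student directly, which is one plausible reading of why the abstract calls the algorithm explainable.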
Which Approach best Predicts Dropouts in Higher Education?
Kerstin Wagner, Henrik Volkening, Sunay Basyigit, Agathe Merceron, Petra Sauer, and Niels Pinkwart
In Proceedings of the 15th International Conference on Computer Supported Education, Apr 2023
To predict whether students will drop out of their degree program in a middle-sized German university, we investigate five algorithms - three explainable and two not - along with two different feature sets.
It turns out that the models obtained with Logistic Regression (LR), an explainable algorithm, have the best performance.
This is an important finding to be able to generate explanations for stakeholders in future work.
The models trained with a local feature set and those trained with a global feature set show similar performance results.
Further, we study whether the models built with LR are fair with respect to both male and female students as well as the study programs considered in this study.
Unfortunately, this is not always the case.
This might be due to differences in the dropout rates between subpopulations.
This limit should be taken into account in practice.
@inproceedings{wagner_which_2023,title={Which {Approach} best {Predicts} {Dropouts} in {Higher} {Education}?},author={Wagner, Kerstin and Volkening, Henrik and Basyigit, Sunay and Merceron, Agathe and Sauer, Petra and Pinkwart, Niels},booktitle={Proceedings of the 15th {International} {Conference} on {Computer} {Supported} {Education}},publisher={SciTePress},address={Prague, Czech Republic},year={2023},month=apr,isbn={978-989-758-641-5},doi={10.5220/0011838100003470},url={https://www.scitepress.org/Link.aspx?doi=10.5220/0011838100003470},projects={students_advice}}
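To make the fairness question in this abstract concrete, the following Python sketch compares a classifier's recall across subgroups. The choice of recall as the metric, the group labels, and the toy predictions are assumptions for illustration; the abstract does not state which fairness measures were used.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups):
    """Return recall on the positive class (1 = dropout) for each subgroup."""
    hits = defaultdict(int)       # correctly predicted dropouts per group
    positives = defaultdict(int)  # actual dropouts per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                hits[group] += 1
    return {g: hits[g] / positives[g] for g in positives}


# Toy labels and predictions (assumed), with a hypothetical subgroup attribute.
y_true = [1, 1, 1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
groups = ["f", "f", "f", "f", "m", "m", "m", "m"]

recalls = recall_by_group(y_true, y_pred, groups)
gap = max(recalls.values()) - min(recalls.values())
print(recalls)
print(round(gap, 3))
```

A large gap between subgroups would indicate that dropouts in one group are missed more often than in another. As the abstract notes, such gaps can also reflect different dropout rates between subpopulations.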
Personalized and Explainable Course Recommendations for Students at Risk of Dropping out [Poster]
This paper presents a course recommender system designed to support students who are struggling in their first semesters of university and who are at risk of dropping out.
Considering the needs expressed by our students, we recommend a set of courses that have been passed by the majority of their nearest neighbors who have successfully graduated.
We describe this recommender system, which is based on the explainable k-Nearest Neighbors algorithm, and evaluate the recommendations after the 1st and the 2nd semester using historical data.
The evaluation reveals that the recommendations correspond to the actual courses passed by students who graduated, whereas the recommendations and actually passed courses differ for students who dropped out.
The recommendations show struggling students a different, ambitious, but hopefully feasible way through the study program.
Furthermore, a dropout prediction confirms that students are less likely to drop out when they pass the courses recommended to them.
@inproceedings{wagner_personalized_2022,title={Personalized and {Explainable} {Course} {Recommendations} for {Students} at {Risk} of {Dropping} out [{Poster}]},author={Wagner, Kerstin and Merceron, Agathe and Sauer, Petra and Pinkwart, Niels},year={2022},month=jul,booktitle={Proceedings of the 15th {International} {Conference} on {Educational} {Data} {Mining}},publisher={International Educational Data Mining Society},address={Durham, United Kingdom},pages={657--661},doi={10.5281/zenodo.6853008},isbn={978-1-73367-363-1},url={https://doi.org/10.5281/zenodo.6853008},editor={Mitrovic, Antonija and Bosch, Nigel},projects={students_advice}}
Eliciting Students’ Needs and Concerns about a Novel Course Enrollment Support System [Workshop]
Selecting courses that optimally fit a student’s situation can help reduce the risk of dropping out.
Data exploration and performance prediction approaches can be applied to help students make these decisions.
To ensure that an enrollment support system meets the needs of students, they should be involved as early as possible in the development process.
This paper presents an initial assessment of some functionalities of a novel course enrollment support system based on student performance data.
The results include a collection of indicators and sources of information, as well as an overview of needs and concerns.
The insights gathered will help to develop a system that has the trust of students.
@inproceedings{wagner_eliciting_2021,title={Eliciting {Students}' {Needs} and {Concerns} about a {Novel} {Course} {Enrollment} {Support} {System} [{Workshop}]},author={Wagner, Kerstin and Hilliger, Isabel and Merceron, Agathe and Sauer, Petra},year={2021},booktitle={Companion {Proceedings} of the 11th {International} {Conference} on {Learning} {Analytics} \& {Knowledge}},pages={294--304},url={https://www.solaresearch.org/core/lak21-companion-proceedings/},projects={students_advice}}
Investigating the Impact of Outliers on Dropout Prediction in Higher Education [Workshop]
Many institutions of higher education seek to reduce the dropout rate through the development of models which can detect students with a high risk of dropping out to provide specific advice for them.
Classical models usually ignore outlier students, whose characteristics are uncommon and inconsistent, although these students may reveal significant information to domain experts and affect the prediction models.
The present paper provides an analysis of students’ performance and aims to answer the following research questions: What kinds of outlier students can be detected? Do outliers affect dropout prediction models?
To answer the first question, outlier students have been detected and their characteristics have been analyzed.
To address the second question, the dropout prediction models have been compared in terms of different algorithms and the presence of outliers in the data.
The results of the work indicate that the performance of prediction models, particularly in terms of recall, can be improved by removing outliers.
@inproceedings{novoseltseva_investigating_2021,title={Investigating the {Impact} of {Outliers} on {Dropout} {Prediction} in {Higher} {Education} [{Workshop}]},author={Novoseltseva, Daria and Wagner, Kerstin and Merceron, Agathe and Sauer, Petra and Jessel, Nadine and Sedes, Florence},year={2021},booktitle={Proceedings of {DELFI} {Workshops} 2021},publisher={Gesellschaft für Informatik e.V.z.},pages={120--129},isbn={978-3-946757-03-0},url={https://nbn-resolving.org/urn:nbn:de:hbz:1393-opus4-7338},projects={students_advice}}
Accuracy of a Cross-Program Model for Dropout Prediction in Higher Education [Workshop]
Reducing dropout rates in higher education would allow increasing the number of graduates.
If one can predict early enough whether a student might drop out, targeted counseling could be put in place.
This work replicates the approach of Berens et al. (2019) to predict whether students might drop out using academic performance data from their first semester.
Further, the approach is extended by comparing the results of the cross-program model on specific programs of study with the results of the models trained for each specific program.
The findings support the generalization of the approach of Berens et al. (2019) to the German context, which could serve to establish best practices for dropout prediction in higher education.
@inproceedings{wagner_accuracy_2020,title={Accuracy of a {Cross}-{Program} {Model} for {Dropout} {Prediction} in {Higher} {Education} [{Workshop}]},author={Wagner, Kerstin and Merceron, Agathe and Sauer, Petra},year={2020},month=mar,booktitle={Companion {Proceedings} of the 10th {International} {Learning} {Analytics} \& {Knowledge} {Conference} ({LAK} 2020)},pages={744--749},url={https://www.solaresearch.org/core/lak20-companion-proceedings/},projects={students_advice}}
Erste Untersuchungen zur Notenprognose für ein Kursempfehlungssystem [Workshop]
Course recommender systems can support student success.
An important component of such a system is the prediction of the grade that students can expect when enrolling in a course.
In this paper, various algorithms for grade prediction are applied and compared. The linear regression models yield the better results.
In addition, they have the advantage of being interpretable, which enables users to better assess the limits of the model and thus to decide how much trust to place in the system.
@inproceedings{wagner_erste_2020,title={Erste {Untersuchungen} zur {Notenprognose} für ein {Kursempfehlungssystem} [{Workshop}]},author={Wagner, Kerstin and Merceron, Agathe and Sauer, Petra},year={2020},booktitle={Proceedings of {DELFI} {Workshops} 2020},publisher={Gesellschaft für Informatik e.V.z.},pages={103--112},doi={10.18420/delfi2020-ws-112},url={http://dl.gi.de/handle/20.500.12116/34575},urldate={2021-07-13},projects={students_advice}}
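As a rough illustration of grade prediction with linear regression, the Python sketch below fits a one-feature least-squares line. The single predictor (a student's mean grade so far), the toy numbers, the 1.0 to 5.0 grade scale, and the function names `fit_line` and `predict_grade` are assumptions; the abstract does not specify the features or model variants used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_grade(mean_grade, slope, intercept):
    """Predict a course grade, clamped to an assumed scale of 1.0 (best) to 5.0 (fail)."""
    return min(5.0, max(1.0, slope * mean_grade + intercept))


# Toy training data (assumed): mean grade so far -> grade in the target course.
mean_grades = [1.0, 2.0, 3.0, 4.0]
course_grades = [1.5, 2.0, 3.5, 4.0]

slope, intercept = fit_line(mean_grades, course_grades)
print(round(slope, 2), round(intercept, 2))
print(round(predict_grade(2.5, slope, intercept), 2))
```

The fitted slope and intercept can be shown to users as they are, which reflects the interpretability advantage mentioned in the abstract: a user can see how strongly past performance drives the prediction and judge how far to trust it.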