ERIC Number: ED630856
Record Type: Non-Journal
Publication Date: 2023
Pages: 12
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: N/A
Investigating the Importance of Demographic Features for EDM-Predictions
Cohausz, Lea; Tschalzev, Andrej; Bartelt, Christian; Stuckenschmidt, Heiner
International Educational Data Mining Society, Paper presented at the International Conference on Educational Data Mining (EDM) (16th, Bengaluru, India, Jul 11-14, 2023)
Demographic features are commonly used in Educational Data Mining (EDM) research to predict at-risk students. Yet this practice has to be considered extremely problematic, both because the data are sensitive and because historical and representation biases likely exist in the training data, which raises strong fairness concerns. At the same time, and despite their frequent use, the value of demographic features for prediction accuracy remains unclear. In this paper, we systematically investigate the importance of demographic features for at-risk prediction using several publicly available datasets from different countries. We find strong evidence that including demographic features does not lead to better-performing models as long as some study-related features, such as performance or activity data, are available. Additionally, we show that models nonetheless place importance on these features when they are included in the data, although this is not necessary for accuracy. These findings, together with our discussion, strongly suggest that at-risk prediction should not include demographic features. Our code is available at: https://anonymous.4open.science/r/edm-F7D1. [For the complete proceedings, see ED630829.]
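The kind of comparison the abstract describes (training one model with demographic features and one without, then comparing accuracy) can be sketched as follows. This is an illustrative sketch only, not the authors' code, which is at the repository linked above. It assumes purely synthetic data in which the at-risk label depends on two study-related features only; the feature names, the single binary demographic flag, and the helper functions (`make_synthetic_students`, `train_logreg`, `ablation`) are all hypothetical, and a small hand-written logistic regression stands in for whatever models the paper evaluates.

```python
import math
import random


def make_synthetic_students(n, seed=0):
    """Generate purely synthetic rows: ([activity, prior_grade, demographic], at_risk)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        activity = rng.gauss(0.0, 1.0)            # study-related feature (hypothetical)
        prior_grade = rng.gauss(0.0, 1.0)         # study-related feature (hypothetical)
        demographic = float(rng.random() < 0.5)   # binary demographic flag (hypothetical)
        # Assumption: the label depends on the study-related features only.
        logit = -1.5 * activity - 1.0 * prior_grade
        p_at_risk = 1.0 / (1.0 + math.exp(-logit))
        at_risk = 1 if rng.random() < p_at_risk else 0
        rows.append(([activity, prior_grade, demographic], at_risk))
    return rows


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, epochs=200, lr=0.1):
    """Fit logistic regression with plain batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Share of rows whose thresholded prediction matches the label."""
    correct = 0
    for xi, yi in zip(X, y):
        pred = 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        correct += int(pred == yi)
    return correct / len(y)


def ablation(rows, keep):
    """Train on the first 70% of rows using only the columns in `keep`; test on the rest."""
    split = int(0.7 * len(rows))
    X = [[x[j] for j in keep] for x, _ in rows]
    y = [label for _, label in rows]
    w, b = train_logreg(X[:split], y[:split])
    return accuracy(w, b, X[split:], y[split:]), w


rows = make_synthetic_students(600)
acc_with, w_with = ablation(rows, keep=[0, 1, 2])   # includes the demographic flag
acc_without, w_without = ablation(rows, keep=[0, 1])  # study-related features only
print(f"with demographic feature:    {acc_with:.3f}")
print(f"without demographic feature: {acc_without:.3f}")
print(f"learned weight on demographic flag: {w_with[2]:.3f}")
```

Because the synthetic labels are generated without reference to the demographic flag, any accuracy gap between the two runs reflects noise rather than real signal; the sketch shows only the shape of the comparison and says nothing about the paper's actual datasets or results.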
Descriptors: Information Retrieval, Data Processing, Pattern Recognition, Information Technology, At Risk Students, Prediction, Accuracy, Foreign Countries, Student Characteristics, College Students, High School Students, Academic Achievement
International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: https://educationaldatamining.org/conferences/
Publication Type: Speeches/Meeting Papers; Reports - Research
Education Level: Secondary Education; Higher Education; Postsecondary Education; High Schools
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Location: Portugal; Colombia; Kuwait; Jordan
Grant or Contract Numbers: N/A
Author Affiliations: N/A