Peer reviewed
ERIC Number: EJ1345841
Record Type: Journal
Publication Date: 2021
Pages: 7
Abstractor: ERIC
ISBN: N/A
ISSN: 1539-9664
EISSN: 1539-9672
Available Date: N/A
Big Data on Campus: Putting Predictive Analytics to the Test
Bird, Kelli A.; Castleman, Benjamin L.; Song, Yifeng; Mabel, Zachary
Education Next, v21 n4 p58-64 Fall 2021
An estimated 1,400 colleges and universities nationwide have invested in predictive analytics technology to identify which students are at risk of failing courses or dropping out, with spending estimated in the hundreds of millions of dollars. How accurate and stable are those predictions? The authors put six predictive models to the test to gain a fuller understanding of how they work and the tradeoffs between simpler and more complex approaches: (1) Ordinary Least Squares; (2) Logistic Regression; (3) Cox Proportional Hazard Survival Analysis; (4) Random Forest; (5) XGBoost; and (6) Recurrent Neural Networks. The study uses detailed student data from the Virginia Community College System to investigate how accurately each model predicts whether a student graduates with a college-level credential within six years of entering school. Using the same models, the authors also examine whether a given student's predicted risk of dropping out is consistent from one model to the next. The findings show that complex machine-learning models are not necessarily better at predicting students' future outcomes than simpler statistical techniques. The authors also find that the dropout risk predictions assigned to a given student are not stable across models. This volatility is particularly pronounced for the more complex machine-learning models, as those approaches are more sensitive to which predictors are included and to which students and institutions make up the sample. Finally, the authors show that students from underrepresented groups, such as Black students, have a lower predicted probability of graduating than students from other groups. While this could lead underrepresented students to receive additional support, the experience of being labeled "at risk" could exacerbate concerns these students may already have about their potential for success in college. Addressing this hazard is not as straightforward as simply removing demographic predictors from the models; doing so was found to have no effect on model performance, because the most influential predictors of college completion, such as semester-level GPA and credits earned, are correlated with group membership owing to longstanding inequities in the educational system. The findings raise important questions for institutions and policymakers about the value of investments in predictive analytics.
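The cross-model stability question at the heart of the abstract can be illustrated with a short sketch. This is not the authors' code: it uses synthetic data in place of the (non-public) Virginia Community College System records, scikit-learn's GradientBoostingClassifier stands in for XGBoost, and the Cox survival and recurrent neural network models are omitted to keep the example self-contained.

```python
# A minimal sketch of the cross-model comparison described in the abstract:
# fit several classifiers to the same student data and measure how much the
# predicted completion probability for a given student varies across models.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for features such as semester GPA and credits earned;
# label 1 = completed a credential within six years, 0 = did not.
X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "OLS (linear probability)": LinearRegression(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),  # XGBoost stand-in
}

preds = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    if hasattr(model, "predict_proba"):
        p = model.predict_proba(X_test)[:, 1]
    else:
        # OLS has no probability output; clip the linear predictor into [0, 1].
        p = np.clip(model.predict(X_test), 0.0, 1.0)
    preds[name] = p

# Per-student volatility: the spread of predicted completion probability
# across models for the same student.
stacked = np.column_stack(list(preds.values()))
spread = stacked.max(axis=1) - stacked.min(axis=1)
print(f"median cross-model spread per student: {np.median(spread):.3f}")
print(f"students with spread > 0.2: {(spread > 0.2).mean():.1%}")
```

The spread statistic mirrors the paper's stability question: if the models agreed, the per-student spread would be near zero, while a large spread means the "at risk" label a student receives depends on which model an institution happens to deploy.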
Education Next Institute, Inc. Harvard Kennedy School, Taubman 310, 79 JFK Street, Cambridge, MA 02138; Fax: 617-496-4428; e-mail: Education_Next@hks.harvard.edu; Web site: https://www.educationnext.org/the-journal/
Publication Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education; Two Year Colleges
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Location: Virginia
Grant or Contract Numbers: N/A
Author Affiliations: N/A