ERIC Number: ED656688
Record Type: Non-Journal
Publication Date: 2018
Pages: 246
Abstractor: As Provided
ISBN: 979-8-3828-7711-2
ISSN: N/A
EISSN: N/A
Available Date: N/A
Aiming for Success: Evaluating Statistical and Machine Learning Methods to Predict High School Student Performance and Improve Early Warning Systems
David M. Alexandro
ProQuest LLC, Ph.D. Dissertation, University of Connecticut
In response to the high school dropout crisis, which comes with great economic and social costs, early warning systems (EWSs) have been developed to systematically predict and improve student outcomes. The purpose of this study is to evaluate different statistical and machine learning methods to predict high school student performance and improve EWSs. By improving education EWSs, this study aims to better identify students in need of targeted support and to inform on-the-ground practitioners, who can intervene long before students drop out. The study applies these methods to a cohort of 40,008 Connecticut students. It used more than 100 predictors and developed models to predict each student's probability of being on-track to graduate within four years, using data collected prior to the student's entry into 9th grade. Random forest, classification and regression tree (CART, or decision tree), and regularized logistic regression--ridge, lasso, and elastic net--models were developed, and their performance was evaluated on a validation dataset by comparing classification accuracy measures. The study found that random forest models developed using a training set balanced by oversampling best identified which students are at risk. These models captured complex interactions among covariates and performed best when thresholds were optimized using Youden's index rather than left at the default 0.5 cut-off. The variable importance rankings showed that standardized test scores, attendance, and course performance were the top-ranking predictors of being on-track. Coefficients from elastic net models provided nuanced information to complement the random forest results. In addition, incorporating detailed special education-related predictors improved classification accuracy, especially for students with disabilities.
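As a rough illustration of the threshold optimization mentioned above, Youden's index is J = sensitivity + specificity - 1, and the chosen cut-off is the one that maximizes J. The sketch below is not the dissertation's code. The function names (`youden_j`, `youden_threshold`), the toy labels and scores, and the choice of label 1 as the positive class are all assumptions made for illustration only.

```python
def youden_j(y_true, scores, threshold):
    """Youden's J (sensitivity + specificity - 1) at one cut-off.

    A case is predicted positive when its score is >= threshold.
    Label 1 is treated as the positive class (an assumption of this sketch).
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    true_pos = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    true_neg = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    return true_pos / positives + true_neg / negatives - 1.0


def youden_threshold(y_true, scores):
    """Return (threshold, J) for the observed score that maximizes J.

    Ties are resolved in favor of the lowest threshold.
    """
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        j = youden_j(y_true, scores, t)
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical validation-set labels and predicted probabilities (toy data).
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.70, 0.90]

best_t, best_j = youden_threshold(y_true, scores)
default_j = youden_j(y_true, scores, 0.5)
print(best_t, best_j, default_j)
```

On this toy data the J-maximizing cut-off is lower than 0.5 and yields a higher J than the default. In practice the threshold would be chosen on data held out from model fitting, as the abstract describes for the validation dataset.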
This study fills a practical gap in education by supporting the development of more sophisticated predictive models. Researchers can use its approach to help ensure that future EWSs work well. It also gives practitioners an opportunity to apply new knowledge about at-risk students and to test interventions at many levels to improve graduation outcomes. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml.]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: High Schools; Secondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Location: Connecticut
Grant or Contract Numbers: N/A
Author Affiliations: N/A