Peer reviewed
ERIC Number: EJ1320634
Record Type: Journal
Publication Date: 2021
Pages: 166
Abstractor: As Provided
ISBN: N/A
ISSN: EISSN-2157-2100
EISSN: N/A
Available Date: N/A
Stacked Ensemble Learning for Propensity Score Methods in Observational Studies
Autenrieth, Maximilian; Levine, Richard A.; Fan, Juanjuan; Guarcello, Maureen A.
Journal of Educational Data Mining, v13 n1 p24-189 2021
Propensity score methods account for selection bias in observational studies. However, the consistency of the propensity score estimators strongly depends on a correct specification of the propensity score model. Logistic regression and, with increasing popularity, machine learning tools are used to estimate propensity scores. We introduce a stacked generalization ensemble learning approach to improve propensity score estimation by fitting a meta learner on the predictions of a suitable set of diverse base learners. We perform a comprehensive Monte Carlo simulation study, implementing a broad range of scenarios that mimic characteristics of typical data sets in educational studies. The population average treatment effect is estimated using the propensity score in Inverse Probability of Treatment Weighting. Our proposed stacked ensembles, especially using gradient boosting machines as a meta learner trained on a set of 12 base learner predictions, led to superior reduction of bias compared to the current state-of-the-art in propensity score estimation. Further, our simulations imply that commonly used balance measures (averaged standardized absolute mean differences) might be misleading as propensity score model selection criteria. We apply our proposed model -- which we call GBM-Stack -- to assess the population average treatment effect of a Supplemental Instruction (SI) program in an introductory psychology (PSY 101) course at San Diego State University. Our analysis provides evidence that moving the whole population to SI attendance would on average lead to 1.69 times higher odds to pass the PSY 101 class compared to not offering SI, with a 95% bootstrap confidence interval of (1.31, 2.20).
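The Inverse Probability of Treatment Weighting step named in the abstract can be illustrated with a minimal sketch. It assumes propensity scores have already been estimated by some model (for example, a stacked ensemble) and shows one common way to turn them into a population-level odds ratio. The function names, the normalized weighting, and the toy data are hypothetical illustrations, not the authors' implementation or their data.

```python
def iptw_weights(treated, propensity):
    """ATE weights: 1/e for treated units, 1/(1 - e) for untreated units."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t == 1 else 1.0 / (1.0 - e))
    return weights


def weighted_outcome_rate(outcome, weights, treated, arm):
    """Weighted mean of a binary outcome within one treatment arm (normalized weights)."""
    num = sum(w * y for y, w, t in zip(outcome, weights, treated) if t == arm)
    den = sum(w for w, t in zip(weights, treated) if t == arm)
    return num / den


def iptw_odds_ratio(treated, outcome, propensity):
    """Marginal odds ratio of the outcome, treated versus untreated, under IPTW."""
    w = iptw_weights(treated, propensity)
    p1 = weighted_outcome_rate(outcome, w, treated, 1)
    p0 = weighted_outcome_rate(outcome, w, treated, 0)
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


# Hypothetical toy data: 1 = attended SI / passed, 0 = did not.
treated = [1, 1, 1, 0, 0, 0]
outcome = [1, 1, 0, 1, 0, 0]
propensity = [0.5] * 6  # assumed to come from a previously fitted model

odds_ratio = iptw_odds_ratio(treated, outcome, propensity)
print(round(odds_ratio, 6))
```

With constant propensity scores the weights cancel, so the sketch reduces to the unadjusted odds ratio; varying scores reweight each arm toward the full population, which is what makes the estimate a population average effect.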
International Educational Data Mining. e-mail: jedm.editor@gmail.com; Web site: https://jedm.educationaldatamining.org/index.php/JEDM
Publication Type: Journal Articles; Reports - Research; Numerical/Quantitative Data
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Location: California (San Diego)
Grant or Contract Numbers: N/A
Author Affiliations: N/A