ERIC Number: ED670832
Record Type: Non-Journal
Publication Date: 2024-Jul
Pages: 8
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: 0000-00-00
Using Publicly Available Auxiliary Data to Improve Precision of Treatment Effect Estimation in a Randomized Efficacy Trial
Charlotte Z. Mann; Jiaying Wang; Adam Sales; Johann A. Gagnon-Bartsch
Grantee Submission, Paper presented at the International Conference on Educational Data Mining (17th, Atlanta, GA, Jul 2024)
The gold standard for evaluating the effect of an educational intervention on student outcomes is to run a randomized controlled trial (RCT). However, RCTs are often small due to logistical considerations, and the resulting treatment effect estimates may lack precision. Recent methods improve experimental precision by incorporating information from large, observational, auxiliary data sets. Specifically, predictions of the outcome of interest from a model fit on the auxiliary data can be used in covariate adjustment. Such auxiliary data, on students or schools not included in an RCT but with similar characteristics, are often available for educational RCTs. This is the case for a trial evaluating the efficacy of the Cognitive Tutor Algebra I curriculum (CTAI), an alternative algebra curriculum that included a computerized tutoring system. The Texas Education Agency (TEA) provides publicly available data on thousands of schools across Texas, including the 44 schools randomized in the CTAI study as well as nearly 3,000 additional, auxiliary schools. We develop an auxiliary model predicting passing rates for a standardized test in mathematics, which flexibly incorporates the 5,000 covariates available in the TEA data through random forest modeling. We compare our approach, using these auxiliary model predictions, to more standard estimators of the effect of CTAI on schools' mathematics passing rates. We find that leveraging information from the auxiliary data increases precision beyond standard methods that rely on the experimental sample alone, even for this paired trial with a powerful baseline covariate. We additionally demonstrate that working with auxiliary information provides practical benefits for analysis, beyond this increased estimation precision. [This paper was published in: "Proceedings of the 17th International Conference on Educational Data Mining," edited by B. Paaßen and C. D. Epp, International Educational Data Mining Society, 2024, pp. 518-25.]
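The general idea in the abstract, fitting a prediction model on auxiliary schools and using its predictions for covariate adjustment in a paired trial, can be illustrated with a minimal sketch. This is not the paper's estimator: a one-covariate least-squares line stands in for the random forest, the adjustment is a simple residual difference within each pair, and all data are synthetic. The names `fit_line`, `paired_estimate`, `outcome`, and `TAU`, and every numeric constant, are hypothetical. The 22 pairs only mirror the 44 randomized schools mentioned above, and pairs here are formed arbitrarily rather than matched.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares intercept and slope for y ~ x (stand-in for a random forest)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def paired_estimate(diffs):
    """Mean of within-pair differences and its standard error."""
    n = len(diffs)
    return statistics.fmean(diffs), statistics.stdev(diffs) / n ** 0.5

rng = random.Random(0)
TAU = 3.0  # hypothetical true treatment effect (synthetic)

def outcome(x, treated):
    """Synthetic passing rate: depends on covariate x, noise, and treatment."""
    return 50 + 0.8 * x + rng.gauss(0, 2) + (TAU if treated else 0.0)

# Step 1: fit the auxiliary model on schools outside the trial (all untreated).
aux_x = [rng.gauss(0, 10) for _ in range(3000)]
aux_y = [outcome(x, False) for x in aux_x]
a, b = fit_line(aux_x, aux_y)

# Step 2: simulate a paired trial; a coin flip picks the treated school in each pair.
raw_diffs, adj_diffs = [], []
for _ in range(22):
    x1, x2 = rng.gauss(0, 10), rng.gauss(0, 10)
    xt, xc = (x1, x2) if rng.random() < 0.5 else (x2, x1)
    yt, yc = outcome(xt, True), outcome(xc, False)
    # Unadjusted: treated minus control outcome.
    raw_diffs.append(yt - yc)
    # Adjusted: subtract the auxiliary prediction from each outcome first.
    adj_diffs.append((yt - (a + b * xt)) - (yc - (a + b * xc)))

raw_est, raw_se = paired_estimate(raw_diffs)
adj_est, adj_se = paired_estimate(adj_diffs)
print(f"unadjusted: {raw_est:.2f} (SE {raw_se:.2f})")
print(f"adjusted:   {adj_est:.2f} (SE {adj_se:.2f})")
```

Because the auxiliary model is fit only on schools outside the trial, its predictions do not depend on treatment assignment, so subtracting them leaves the randomization-based comparison intact while removing outcome variation the predictions explain.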
Descriptors: Intervention, Outcomes of Education, Randomized Controlled Trials, Data Use, Goodness of Fit, Algebra, Mathematics Instruction, Intelligent Tutoring Systems, Instructional Effectiveness, Prediction, Mathematics Curriculum, Models, Comparative Analysis, Mathematics Achievement, Middle School Students, High School Students, Standardized Tests, Mathematics Tests, Data Interpretation
Publication Type: Speeches/Meeting Papers; Reports - Research
Education Level: Junior High Schools; Middle Schools; Secondary Education; High Schools
Audience: N/A
Language: English
Sponsor: Institute of Education Sciences (ED); National Science Foundation (NSF), Division of Mathematical Sciences (DMS)
Authoring Institution: N/A
Identifiers - Location: Texas
IES Funded: Yes
Grant or Contract Numbers: R305D210031; 1646108
Department of Education Funded: Yes