ERIC Number: EJ1469000
Record Type: Journal
Publication Date: 2025-Dec
Pages: 21
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: EISSN-2196-0739
Available Date: 2025-04-04
Evaluating Uncertainty: The Impact of the Sampling and Assessment Design on Statistical Inference in the Context of ILSA
Diego Cortes1; Dirk Hastedt1; Sabine Meinck1
Large-scale Assessments in Education, v13, Article 10, 2025
This paper informs users of data collected in international large-scale assessments (ILSA) by presenting arguments underlining the importance of considering two design features employed in these studies. We examine a common misconception stating that the uncertainty arising from the assessment design is negligible compared with that arising from the sampling design. This misconception can lead to the erroneous conclusion that there is always a relatively low risk of ignoring the uncertainty arising from the assessment design when reporting estimates of population parameters. We use the design effect framework to assess the impact that the sampling and the assessment design have on the estimation. We first evaluate the loss in efficiency in the estimation of a population parameter attributable to each of the designs. We then examine whether knowledge about the effect of one design feature can justify any belief about the effect of the other design feature. We repeat this examination across different parameters characterizing the achievement distribution in a population. We provide empirical results using data collected for PIRLS 2016. Our empirical results can be summarized in two general findings. First, when estimating mean achievement, the effect of the sampling design is often substantially larger than that of the assessment design. This finding might explain the misconception we try to address. However, we show that this is not true in all instances, and the magnitude of the difference between both design effects is context dependent and hence not generalizable. Second, differences in design effects become less predictable when estimating other parameters, e.g., the proportion of students reaching a certain threshold in the achievement scale (i.e., benchmarks), or an association estimated using linear regression. This contribution underlines that accounting for all sources of uncertainty in the estimation is of paramount importance to obtain credible inferences.
We conclude that it is difficult to justify a priori the belief that the effect of the sampling design in the estimation is always greater than that of the assessment design.
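The design effect framework mentioned in the abstract compares an estimator's variance under the actual (complex) sampling design with its variance under simple random sampling of the same size. The following is a minimal illustrative sketch, not taken from the article: it uses a hypothetical two-stage population (invented school and student counts, means, and variances) and Monte Carlo simulation to show why clustered sampling, as used in ILSA studies such as PIRLS, typically inflates the variance of the estimated mean.

```python
# Illustrative sketch (hypothetical data, not from the article): the design
# effect (deff) of an estimator is the ratio of its sampling variance under
# the complex design to its variance under simple random sampling (SRS)
# with the same number of observations.
import random
import statistics

random.seed(42)

# Hypothetical population: 200 schools (clusters) of 30 students each, with
# between-school variation in mean achievement (mimicking two-stage sampling).
schools = []
for _ in range(200):
    school_mean = random.gauss(500, 40)   # between-school spread
    schools.append([random.gauss(school_mean, 80) for _ in range(30)])

population = [x for school in schools for x in school]

# Monte Carlo: variance of the sample mean under each design, n = 150.
R = 2000
cluster_means, srs_means = [], []
for _ in range(R):
    # Two-stage cluster sample: draw 5 whole schools -> 150 students.
    picked = random.sample(schools, 5)
    cluster_means.append(statistics.fmean(x for s in picked for x in s))
    # Simple random sample of the same size from the full population.
    srs_means.append(statistics.fmean(random.sample(population, 150)))

deff = statistics.variance(cluster_means) / statistics.variance(srs_means)
print(round(deff, 2))  # well above 1: clustering reduces effective sample size
```

With these (made-up) variance components the intraclass correlation is about 0.2, so the classic approximation deff ≈ 1 + (b − 1)ρ with cluster size b = 30 predicts a design effect of roughly 7, which the simulation reproduces. The paper's point is that an analogous inflation factor can be attributed to the assessment design (e.g., plausible values from a rotated booklet design), and that neither factor can be assumed a priori to dominate the other.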
Descriptors: Sampling, Research Design, Educational Assessment, Statistical Inference, International Assessment, Misconceptions, Achievement Tests, Foreign Countries, Grade 4, Reading Tests, Reading Achievement
Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/
Publication Type: Journal Articles; Reports - Research
Education Level: Elementary Education; Grade 4; Intermediate Grades
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Assessments and Surveys: Progress in International Reading Literacy Study
Grant or Contract Numbers: N/A
Author Affiliations: 1International Association for the Evaluation of Educational Achievement (IEA), Hamburg, Germany