Showing 1 to 15 of 45 results
Peer reviewed
PDF on ERIC
Conrad Borchers – International Educational Data Mining Society, 2025
Algorithmic bias is a pressing concern in educational data mining (EDM), as it risks amplifying inequities in learning outcomes. The Area Between ROC Curves (ABROCA) metric is frequently used to measure discrepancies in model performance across demographic groups to quantify overall model fairness. However, its skewed distribution--especially when…
Descriptors: Algorithms, Bias, Statistics, Simulation
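The ABROCA metric mentioned in the abstract above can be illustrated with a short sketch: the absolute area between two groups' ROC curves, integrated over the false positive rate. This is a minimal pure-Python illustration of the general idea; the data, function names, and grid resolution are assumptions, not code from the paper.

```python
# Sketch of ABROCA: area between the ROC curves of two demographic groups.

def roc_points(y_true, y_score):
    """ROC step-function points (fpr, tpr) for binary labels and scores."""
    pairs = sorted(zip(y_score, y_true), reverse=True)  # descending score
    pos = sum(y_true)
    neg = len(y_true) - pos
    fpr, tpr, tp, fp = [0.0], [0.0], 0, 0
    for _, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr

def step_interp(fpr, tpr, grid):
    """Evaluate the ROC step function on an ascending FPR grid."""
    out, j = [], 0
    for x in grid:
        while j + 1 < len(fpr) and fpr[j + 1] <= x:
            j += 1
        out.append(tpr[j])
    return out

def abroca(y_a, s_a, y_b, s_b, steps=1000):
    """Approximate area between the ROC curves of groups a and b."""
    grid = [i / steps for i in range(steps + 1)]
    ta = step_interp(*roc_points(y_a, s_a), grid)
    tb = step_interp(*roc_points(y_b, s_b), grid)
    return sum(abs(u - v) for u, v in zip(ta, tb)) / steps
```

Identical predictions for both groups yield an ABROCA of zero; larger values indicate a larger performance discrepancy between groups.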
Peer reviewed
Direct link
Weese, James D.; Turner, Ronna C.; Liang, Xinya; Ames, Allison; Crawford, Brandon – Educational and Psychological Measurement, 2023
A study was conducted to implement the use of a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and…
Descriptors: Effect Size, Classification, Guidelines, Statistical Analysis
Peer reviewed
Direct link
van Dorresteijn, Chevy; Kan, Kees-Jan; Smits, Niels – Assessment & Evaluation in Higher Education, 2023
When higher education students are assessed multiple times, teachers need to consider how these assessments can be combined into a single pass or fail decision. A common question that arises is whether students should be allowed to take a resit. Previous research has found little to no clear learning benefits of resits and therefore suggested they…
Descriptors: College Students, Student Evaluation, Pretests Posttests, Regression (Statistics)
Peer reviewed
Direct link
Emma Somer; Carl Falk; Milica Miocevic – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Factor Score Regression (FSR) is increasingly employed as an alternative to structural equation modeling (SEM) in small samples. Despite its popularity in psychology, the performance of FSR in multigroup models with small samples remains relatively unknown. The goal of this study was to examine the performance of FSR, namely Croon's correction and…
Descriptors: Scores, Structural Equation Models, Comparative Analysis, Sample Size
Peer reviewed
Direct link
Henninger, Mirka; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023
To detect differential item functioning (DIF), Rasch trees search for optimal split-points in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF…
Descriptors: Item Response Theory, Test Items, Effect Size, Statistical Significance
Lydia Bradford – ProQuest LLC, 2024
In randomized control trials (RCT), the recent focus has shifted to how an intervention yields positive results on its intended outcome. This aligns with the recent push of implementation science in healthcare (Bauer et al., 2015) but goes beyond this. RCTs have moved to evaluating the theoretical framing of the intervention as well as differing…
Descriptors: Hierarchical Linear Modeling, Mediation Theory, Randomized Controlled Trials, Research Design
Peer reviewed
Direct link
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
Beth A. Perkins – ProQuest LLC, 2021
In educational contexts, students often self-select into specific interventions (e.g., courses, majors, extracurricular programming). When students self-select into an intervention, systematic group differences may impact the validity of inferences made regarding the effect of the intervention. Propensity score methods are commonly used to reduce…
Descriptors: Probability, Causal Models, Evaluation Methods, Control Groups
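One common propensity score technique of the kind the abstract above refers to is greedy 1:1 nearest-neighbor matching of treated to control units. The sketch below is an illustration of that general technique, not the dissertation's method; the caliper value and example scores are assumptions, and the propensity scores are taken as already estimated.

```python
# Sketch: greedy 1:1 nearest-neighbor matching on propensity scores.

def greedy_match(treated, control, caliper=0.1):
    """Match each treated unit to its nearest unmatched control unit.

    treated, control: lists of estimated propensity scores.
    Returns (treated_index, control_index) pairs within the caliper.
    """
    available = dict(enumerate(control))      # unmatched control units
    matches = []
    for i, p in enumerate(treated):
        if not available:
            break
        j = min(available, key=lambda k: abs(available[k] - p))
        if abs(available[j] - p) <= caliper:  # enforce maximum distance
            matches.append((i, j))
            del available[j]                  # matching without replacement
    return matches
```

For example, `greedy_match([0.30, 0.70], [0.31, 0.69, 0.95])` pairs each treated unit with its closest control and leaves the distant 0.95 unit unmatched. Matching without replacement and the caliper both trade sample size for closer balance on the score.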
Peer reviewed
Direct link
Liang, Xinya; Kamata, Akihito; Li, Ji – Educational and Psychological Measurement, 2020
One important issue in Bayesian estimation is the determination of an effective informative prior. In hierarchical Bayes models, the uncertainty of hyperparameters in a prior can be further modeled via their own priors, namely, hyper priors. This study introduces a framework to construct hyper priors for both the mean and the variance…
Descriptors: Bayesian Statistics, Randomized Controlled Trials, Effect Size, Sampling
Peer reviewed
Direct link
Kosch, Robin; Jung, Klaus – Research Synthesis Methods, 2019
Research synthesis, e.g., by meta-analysis, is increasingly considered in the area of high-dimensional data from molecular research such as gene and protein expression data, especially because most studies and experiments are performed with very small sample sizes. In contrast to most clinical and epidemiological trials, raw data are often…
Descriptors: Genetics, Meta Analysis, Molecular Structure, Scientific Research
Peer reviewed
Direct link
McNeish, Daniel M.; Stapleton, Laura M. – Educational Psychology Review, 2016
Multilevel models are an increasingly popular method to analyze data that originate from a clustered or hierarchical structure. To effectively utilize multilevel models, one must have an adequately large number of clusters; otherwise, some model parameters will be estimated with bias. The goals for this paper are to (1) raise awareness of the…
Descriptors: Hierarchical Linear Modeling, Statistical Analysis, Sample Size, Effect Size
Peer reviewed
Direct link
Tay, Louis; Huang, Qiming; Vermunt, Jeroen K. – Educational and Psychological Measurement, 2016
In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Descriptors: Item Response Theory, Test Bias, Simulation, College Entrance Examinations
Peer reviewed
Direct link
Suh, Youngsuk – Journal of Educational Measurement, 2016
This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…
Descriptors: Effect Size, Goodness of Fit, Statistical Analysis, Statistical Significance
Peer reviewed
Direct link
Tipton, Elizabeth; Pustejovsky, James E. – Journal of Educational and Behavioral Statistics, 2015
Meta-analyses often include studies that report multiple effect sizes based on a common pool of subjects or that report effect sizes from several samples that were treated with very similar research protocols. The inclusion of such studies introduces dependence among the effect size estimates. When the number of studies is large, robust variance…
Descriptors: Meta Analysis, Effect Size, Computation, Robustness (Statistics)
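The core robust variance estimation (RVE) idea behind the article above can be sketched briefly: a weighted mean effect size whose standard error is built from study-level (cluster-level) residual sums, so dependent effect sizes within a study need not be modeled exactly. This is an illustrative simplification, not the authors' implementation; the weights and data are assumptions, and real RVE adds small-sample corrections of the kind the article studies.

```python
# Sketch: cluster-robust weighted mean effect size for dependent effects.
from collections import defaultdict

def rve_mean(effects, weights, studies):
    """Weighted mean effect size with a cluster-robust standard error."""
    total_w = sum(weights)
    mu = sum(w * y for w, y in zip(weights, effects)) / total_w
    cluster = defaultdict(float)            # weighted residual sum per study
    for y, w, s in zip(effects, weights, studies):
        cluster[s] += w * (y - mu)
    var = sum(v * v for v in cluster.values()) / total_w ** 2
    return mu, var ** 0.5
```

Because the variance sums residuals within each study before squaring, correlated effect sizes from the same sample inflate the standard error rather than being treated as independent.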
Peer reviewed
Direct link
Oshima, T. C.; Wright, Keith; White, Nick – International Journal of Testing, 2015
Raju, van der Linden, and Fleer (1995) introduced a framework for differential functioning of items and tests (DFIT) for unidimensional dichotomous models. Since then, DFIT has been shown to be a quite versatile framework as it can handle polytomous as well as multidimensional models both at the item and test levels. However, DFIT is still limited…
Descriptors: Test Bias, Item Response Theory, Test Items, Simulation