ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	4
Since 2017 (last 10 years)	4
Since 2007 (last 20 years)	6

Descriptor

Reliability	13
Sample Size	13
Simulation	13
Mathematical Models	5
Test Items	5
Error of Measurement	4
Item Analysis	4
Sampling	4
Comparative Analysis	3
Factor Analysis	3
Item Response Theory	3
Models	3
Monte Carlo Methods	3
Bias	2
Correlation	2
Difficulty Level	2
Equated Scores	2
Goodness of Fit	2
Latent Trait Theory	2
Psychometrics	2
Regression (Statistics)	2
Research Problems	2
Statistical Analysis	2
Statistical Inference	2
Test Length	2

Source

Educational and Psychological…	3
American Journal of…	1
Applied Measurement in…	1
International Educational…	1
Measurement:…	1

Publication Type

Reports - Research	9
Journal Articles	6
Speeches/Meeting Papers	5
Reports - Evaluative	4

Audience

Researchers

Assessments and Surveys

ACT Assessment

Showing all 13 results

Toward Sufficient Statistical Power in Algorithmic Bias Assessment: A Test for ABROCA

Peer reviewed

Conrad Borchers – International Educational Data Mining Society, 2025

Algorithmic bias is a pressing concern in educational data mining (EDM), as it risks amplifying inequities in learning outcomes. The Area Between ROC Curves (ABROCA) metric is frequently used to measure discrepancies in model performance across demographic groups to quantify overall model fairness. However, its skewed distribution--especially when…

Descriptors: Algorithms, Bias, Statistics, Simulation

Using Multiple Imputation to Account for the Uncertainty Due to Missing Data in the Context of Factor Retention

Peer reviewed

Yan Xia; Selim Havan – Educational and Psychological Measurement, 2024

Although parallel analysis has been found to be an accurate method for determining the number of factors in many conditions with complete data, its application under missing data is limited. The existing literature recommends that, after using an appropriate multiple imputation method, researchers either apply parallel analysis to every imputed…

Descriptors: Data Interpretation, Factor Analysis, Statistical Inference, Research Problems

There Are Many Greater Lower Bounds than Cronbach's α: A Monte Carlo Simulation Study

Peer reviewed

Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023

A Monte Carlo simulation study was conducted to examine the performance of α, λ2, λ₄, λ₂, ω_T, GLB_MRFA, and GLB_Algebraic coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…

Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation

A Comparison of Reliability Estimation Based on Confirmatory Factor Analysis and Exploratory Structural Equation Models

Peer reviewed

Fu, Yuanshu; Wen, Zhonglin; Wang, Yang – Educational and Psychological Measurement, 2022

Composite reliability, or coefficient omega, can be estimated using structural equation modeling. Composite reliability is usually estimated under the basic independent clusters model of confirmatory factor analysis (ICM-CFA). However, due to the existence of cross-loadings, the model fit of the exploratory structural equation model (ESEM) is…

Descriptors: Comparative Analysis, Structural Equation Models, Factor Analysis, Reliability

Model Choice and Sample Size in Item Response Theory Analysis of Aphasia Tests

Peer reviewed

Hula, William D.; Fergadiotis, Gerasimos; Martin, Nadine – American Journal of Speech-Language Pathology, 2012

Purpose: The purpose of this study was to identify the most appropriate item response theory (IRT) measurement model for aphasia tests requiring 2-choice responses and to determine whether small samples are adequate for estimating such models. Method: Pyramids and Palm Trees (Howard & Patterson, 1992) test data that had been collected from…

Descriptors: Sample Size, Guessing (Tests), Aphasia, Item Response Theory

Comparison of Alternative Models for Item Parameter Estimation with Small Samples.

Parshall, Cynthia G.; Kromrey, Jeffrey D.; Chason, Walter M. – 1996

The benefits of item response theory (IRT) will only accrue to a testing program to the extent that model assumptions are met. Obtaining accurate item parameter estimates is a critical first step. However, the sample sizes required for stable parameter estimation are often difficult to obtain in practice, particularly for the more complex models.…

Descriptors: Comparative Analysis, Estimation (Mathematics), Item Response Theory, Models

The Impact of Outliers on Cronbach's Coefficient Alpha Estimate of Reliability: Visual Analogue Scales

Peer reviewed

Liu, Yan; Zumbo, Bruno D. – Educational and Psychological Measurement, 2007

The impact of outliers on Cronbach's coefficient α has not been documented in the psychometric or statistical literature. This is an important gap because coefficient α is the most widely used measurement statistic in all of the social, educational, and health sciences. The impact of outliers on coefficient α is investigated for…

Descriptors: Psychometrics, Computation, Reliability, Monte Carlo Methods

The Effects of Base Rate, Selection Ratio, Sample Size, and Reliability of Predictors on Predictive Efficiency Indices Associated with Logistic Regression Models.

Soderstrom, Irina R.; Leitner, Dennis W. – 1997

While it is imperative that attempts be made to assess the predictive accuracy of any prediction model, traditional measures of predictive accuracy have been criticized as suffering from "the base rate problem." The base rate refers to the relative frequency of occurrence of the event being studied in the population of interest, and the…

Descriptors: Mathematical Models, Monte Carlo Methods, Prediction, Regression (Statistics)

The Effects of Test Length and Sample Size on the Reliability and Equating of Tests Composed of Constructed-Response Items.

Peer reviewed

Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001

Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least 8 6-point…

Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability

Accuracy of Estimating Two Parameter Logistic Latent Trait Parameters and Implications for Classroom Tests.

Kolen, Michael J.; Whitney, Douglas R. – 1978

The application of latent trait theory to classroom tests necessitates the use of small sample sizes for parameter estimation. Computer generated data were used to assess the accuracy of estimation of the slope and location parameters in the two parameter logistic model with fixed abilities and varying small sample sizes. The maximum likelihood…

Descriptors: Difficulty Level, Item Analysis, Latent Trait Theory, Mathematical Models

A Comparison of the One- and Three-Parameter Logistic Models for Item Calibration.

Reckase, Mark D. – 1978

Five comparisons were made relative to the quality of estimates of ability parameters and item calibrations obtained from the one-parameter and three-parameter logistic models. The results indicate: (1) The three-parameter model fit the test data better in all cases than did the one-parameter model. For simulation data sets, multi-factor data were…

Descriptors: Comparative Analysis, Goodness of Fit, Item Analysis, Mathematical Models

Item Characteristic Curve Parameters: Effects of Sample Size on Linear Equating.

Ree, Malcom James; Jensen, Harald E. – 1980

By means of computer simulation of test responses, the reliability of item analysis data and the accuracy of equating were examined for hypothetical samples of 250, 500, 1000, and 2000 subjects for two tests with 20 equating items plus 60 additional items on the same scale. Birnbaum's three-parameter logistic model was used for the simulation. The…

Descriptors: Computer Assisted Testing, Equated Scores, Error of Measurement, Item Analysis

The Use of Invariance and Bootstrap Procedures as a Method to Establish the Reliability of Research Results.

Sandler, Andrew B. – 1987

Statistical significance is misused in educational and psychological research when it is applied as a method to establish the reliability of research results. Other techniques have been developed which can be correctly utilized to establish the generalizability of findings. Methods that do provide such estimates are known as invariance or…

Descriptors: Analysis of Covariance, Analysis of Variance, Correlation, Discriminant Analysis

Author

Chason, Walter M.	1
Conrad Borchers	1
Fergadiotis, Gerasimos	1
Fitzpatrick, Anne R.	1
Fu, Yuanshu	1
Hula, William D.	1
Jensen, Harald E.	1
Kolen, Michael J.	1
Kromrey, Jeffrey D.	1
Leitner, Dennis W.	1
Liu, Yan	1
Martin, Nadine	1
Novak, Josip	1
Parshall, Cynthia G.	1
Rebernjak, Blaž	1
Reckase, Mark D.	1
Ree, Malcom James	1
Sandler, Andrew B.	1
Selim Havan	1
Soderstrom, Irina R.	1
Wang, Yang	1
Wen, Zhonglin	1
Whitney, Douglas R.	1
Yan Xia	1
Yen, Wendy M.	1