ERIC - Search Results

Showing all 12 results

Using Cumulative Sum Control Chart to Detect Aberrant Responses in Educational Assessments

Peer reviewed

Wan, Siyu; Keller, Lisa A. – Practical Assessment, Research & Evaluation, 2023

Statistical process control (SPC) charts have been widely used in the field of educational measurement. The cumulative sum (CUSUM) is an established SPC method to detect aberrant responses for educational assessments. There are many studies that investigated the performance of CUSUM in different test settings. This paper describes the CUSUM…

Descriptors: Visual Aids, Educational Assessment, Evaluation Methods, Item Response Theory
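
The abstract above is truncated, so the paper's exact CUSUM formulation is not shown here. As a rough illustration only, a generic two-sided CUSUM over item-level residuals might look like the sketch below. The residual definition (observed item score minus model-expected probability), the reference value `k`, the threshold `h`, and the function names `cusum` and `first_signal` are all assumptions made for this example, not details taken from the paper.

```python
def cusum(residuals, k=0.0):
    """Generic two-sided CUSUM (illustrative sketch, not the paper's method).

    residuals: per-item values, assumed here to be observed score minus
               model-expected probability of a correct response.
    k:         hypothetical reference (slack) value; 0.0 means no slack.
    Returns a list of (upper, lower) running sums, one pair per item.
    """
    upper, lower = 0.0, 0.0
    path = []
    for r in residuals:
        # The upper sum accumulates runs of better-than-expected responses.
        upper = max(0.0, upper + r - k)
        # The lower sum accumulates runs of worse-than-expected responses.
        lower = min(0.0, lower + r + k)
        path.append((upper, lower))
    return path


def first_signal(path, h):
    """Index of the first item where either sum crosses the hypothetical
    threshold h, or None if the chart never signals."""
    for i, (upper, lower) in enumerate(path):
        if upper > h or lower < -h:
            return i
    return None


if __name__ == "__main__":
    # Made-up residuals: a run of unexpectedly correct answers at the end.
    chart = cusum([0.1, -0.1, 0.5, 0.5, 0.5])
    print(first_signal(chart, h=0.9))
```

In practice the threshold would normally be calibrated (for example by simulation) rather than fixed by hand; the value used here is arbitrary.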

Hybrid Maximum Clique Algorithm Using Parallel Integer Programming for Uniform Test Assembly

Peer reviewed

Fuchimoto, Kazuma; Ishii, Takatoshi; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2022

Educational assessments often require uniform test forms, for which each test form has equivalent measurement accuracy but with a different set of items. For uniform test assembly, an important issue is the increase of the number of assembled uniform tests. Although many automatic uniform test assembly methods exist, the maximum clique algorithm…

Descriptors: Simulation, Efficiency, Test Items, Educational Assessment

Sensitivity of the RMSD for Detecting Item-Level Misfit in Low-Performing Countries

Peer reviewed

Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020

Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…

Descriptors: Test Items, Goodness of Fit, Probability, Accuracy

Screening Test Items for Differential Item Functioning

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014

A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…

Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing

Assessing the "Rothstein Falsification Test": Does It Really Show Teacher Value-Added Models Are Biased? CEDR Working Paper No. 2012 1.3

Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012

In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…

Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias

Computerized Classification Testing under the One-Parameter Logistic Response Model with Ability-Based Guessing

Peer reviewed

Wang, Wen-Chung; Huang, Sheng-Yun – Educational and Psychological Measurement, 2011

The one-parameter logistic model with ability-based guessing (1PL-AG) has been recently developed to account for effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their…

Descriptors: Computer Assisted Testing, Classification, Item Analysis, Probability

SIMREL: Software for Coefficient Alpha and Its Confidence Intervals with Monte Carlo Studies

Peer reviewed

Yurdugul, Halil – Applied Psychological Measurement, 2009

This article describes SIMREL, a software program designed for the simulation of alpha coefficients and the estimation of its confidence intervals. SIMREL runs on two alternatives. In the first one, if SIMREL is run for a single data file, it performs descriptive statistics, principal components analysis, and variance analysis of the item scores…

Descriptors: Intervals, Monte Carlo Methods, Computer Software, Factor Analysis
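
For readers unfamiliar with the statistic SIMREL targets, coefficient alpha for k items is k/(k-1) times one minus the ratio of summed item variances to the variance of total scores. The sketch below computes only that point estimate with the Python standard library; it is not SIMREL's code, it does not reproduce SIMREL's confidence intervals or Monte Carlo features, and the function name `coefficient_alpha` is hypothetical.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient (Cronbach's) alpha for a respondents-by-items score matrix.

    scores: list of rows, one per respondent, each a list of k item scores.
    Uses sample variances throughout; any consistent variance estimator
    gives the same ratio.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of the total (sum) score across respondents.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Made-up data: four respondents, three items.
    data = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4]]
    print(round(coefficient_alpha(data), 3))
```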

A Bayesian Method for Studying DIF: A Cautionary Tale Filled with Surprises and Delights

Peer reviewed

Wang, Xiaohui; Bradlow, Eric T.; Wainer, Howard; Muller, Eric S. – Journal of Educational and Behavioral Statistics, 2008

In the course of screening a form of a medical licensing exam for items that function differentially (DIF) between men and women, the authors used the traditional Mantel-Haenszel (MH) statistic for initial screening and a Bayesian method for deeper analysis. For very easy items, the MH statistic unexpectedly often found DIF where there was none.…

Descriptors: Bayesian Statistics, Licensing Examinations (Professions), Medicine, Test Items
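
As background for the Mantel-Haenszel screening mentioned in this abstract, the sketch below computes the standard MH common odds ratio across score strata, plus the ETS delta-scale transformation (-2.35 times the natural log of the odds ratio). It is not the authors' code and does not implement their Bayesian method; the function names, the tuple layout of each stratum, and the example counts are assumptions made for illustration.

```python
import math


def mh_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio over 2x2 tables.

    tables: iterable of (a, b, c, d) per matched score stratum, where
        a = reference group correct,  b = reference group incorrect,
        c = focal group correct,      d = focal group incorrect.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("odds ratio undefined: denominator is zero")
    return num / den


def mh_delta(tables):
    """MH D-DIF on the ETS delta scale; negative values indicate the item
    is relatively harder for the focal group."""
    return -2.35 * math.log(mh_odds_ratio(tables))


if __name__ == "__main__":
    # Made-up counts for three score strata.
    strata = [(30, 10, 25, 15), (40, 5, 35, 10), (45, 2, 44, 3)]
    print(mh_odds_ratio(strata), mh_delta(strata))
```

A value of 1 for the odds ratio (0 on the delta scale) corresponds to no DIF after matching on total score.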

The Research behind the New SAT®. Research Summary RS-11

Kobrin, Jennifer L.; Schmidt, Amy Elizabeth – College Board, 2007

This report provides a brief summary of the research projects that have been conducted to support the development of the new SAT.

Descriptors: College Entrance Examinations, Educational Research, Educational Change, Research Projects

Structural Equation Models of Latent Interactions: Evaluation of Alternative Estimation Strategies and Indicator Construction

Peer reviewed

Marsh, Herbert W.; Wen, Zhonglin; Hau, Kit-Tai – Psychological Methods, 2004

Interactions between (multiple indicator) latent variables are rarely used because of implementation complexity and competing strategies. Based on 4 simulation studies, the traditional constrained approach performed more poorly than did 3 new approaches: unconstrained, generalized appended product indicator, and quasi-maximum-likelihood (QML). The…

Descriptors: Structural Equation Models, Item Analysis, Error Patterns, Computation

Generalized Path Analysis and Generalized Simultaneous Equations Model for Recursive Systems with Responses of Mixed Types

Peer reviewed

Tsai, Tien-Lung; Shau, Wen-Yi; Hu, Fu-Chang – Structural Equation Modeling: A Multidisciplinary Journal, 2006

This article generalizes linear path analysis (PA) and simultaneous equations models (SiEM) to deal with mixed responses of different types in a recursive or triangular system. An efficient instrumental variable (IV) method for estimating the structural coefficients of a 2-equation partially recursive generalized path analysis (GPA) model and…

Descriptors: Structural Equation Models, Path Analysis, Simulation, Equations (Mathematics)

A Sharing Item Response Theory Model for Computerized Adaptive Testing

Peer reviewed

Segall, Daniel O. – Journal of Educational and Behavioral Statistics, 2004

A new sharing item response theory (SIRT) model is presented that explicitly models the effects of sharing item content between informants and test takers. This model is used to construct adaptive item selection and scoring rules that provide increased precision and reduced score gains in instances where sharing occurs. The adaptive item selection…

Descriptors: Scoring, Item Analysis, Item Response Theory, Adaptive Testing
