Showing 1 to 15 of 852 results
Peer reviewed
Direct link
Haeju Lee; Kyung Yong Kim – Journal of Educational Measurement, 2025
When no prior information about differential item functioning (DIF) exists for the items in a test, either a rank-based or an iterative purification procedure might be preferred. Rank-based purification selects anchor items based on a preliminary DIF test. For a preliminary DIF test, likelihood ratio test (LRT)-based approaches (e.g.,…
Descriptors: Test Items, Equated Scores, Test Bias, Accuracy
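As general background for the truncated abstract above (not specific to this paper): an LRT-based preliminary DIF test typically compares a model that constrains item j's parameters to be equal across groups against one that frees them,

\[ G_j^2 = -2\,\bigl[\ln L_{\text{constrained}} - \ln L_{\text{free}}\bigr] \sim \chi^2_{df}, \]

with degrees of freedom equal to the number of freed item parameters.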
Peer reviewed
PDF on ERIC: Download full text
Tom Benton – Practical Assessment, Research & Evaluation, 2025
This paper proposes an extension of linear equating that may be useful in one of two fairly common assessment scenarios. One is where different students have taken different combinations of test forms. This might occur, for example, where students have some free choice over the exam papers they take within a particular qualification. In this…
Descriptors: Equated Scores, Test Format, Test Items, Computation
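As background for the abstract above (the paper's specific extension is not shown in the truncated text), standard linear equating maps a score x on form X onto the scale of form Y by matching means and standard deviations:

\[ l_Y(x) = \mu_Y + \frac{\sigma_Y}{\sigma_X}\,(x - \mu_X). \]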
Peer reviewed
Direct link
Youn Seon Lim – Journal of Educational Measurement, 2025
While testlets have proven useful for assessing complex skills, the stem shared by multiple items often induces correlations between responses, leading to violations of local independence (LI), which can result in biased parameter and ability estimates. Diagnostic procedures for detecting testlet effects typically involve model comparisons testing…
Descriptors: Sampling, Statistical Inference, Tests, Statistical Analysis
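The local independence (LI) assumption at issue, in its usual IRT form (general background, not quoted from this abstract): conditional on ability theta, item responses are independent,

\[ P(X_1 = x_1, \ldots, X_J = x_J \mid \theta) = \prod_{j=1}^{J} P(X_j = x_j \mid \theta). \]

A shared testlet stem induces residual dependence among its items, so the joint probability no longer factors this way.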
Peer reviewed
Direct link
Sun-Joo Cho; Amanda Goodwin; Jorge Salas; Sophia Mueller – Grantee Submission, 2025
This study incorporates a random forest (RF) approach to probe complex interactions and nonlinearity among predictors into an item response model with the goal of using a hybrid approach to outperform either an RF or explanatory item response model (EIRM) only in explaining item responses. In the specified model, called EIRM-RF, predicted values…
Descriptors: Item Response Theory, Artificial Intelligence, Statistical Analysis, Predictor Variables
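A minimal sketch of the general idea of feeding random forest predictions into a logistic item-response-style model, assuming scikit-learn and statsmodels are available; this is illustrative only, is not the EIRM-RF specification from the paper, and every variable name below is hypothetical.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500                                              # person-item records (hypothetical)
X = rng.normal(size=(n, 4))                          # person/item covariates (hypothetical)
y = (X[:, 0] + rng.normal(size=n) > 0).astype(int)   # binary item responses

rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)
rf_pred = rf.oob_decision_function_[:, 1]            # out-of-bag predicted probabilities

# Use the RF prediction as a predictor in a logistic model, standing in
# (very loosely) for the explanatory item response component.
logit = sm.Logit(y, sm.add_constant(rf_pred)).fit(disp=0)
print(logit.params)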
Peer reviewed
PDF on ERIC: Download full text
Abdulla Alzarouni; R. J. De Ayala – Practical Assessment, Research & Evaluation, 2025
The assessment of model fit in latent trait modeling is an integral part of correctly applying the model. Still, model-fit assessment has been underutilized for ideal point models such as the Generalized Graded Unfolding Model (GGUM). The current study assesses the performance of the relative fit indices "AIC" and "BIC,"…
Descriptors: Goodness of Fit, Models, Statistical Analysis, Sample Size
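The relative fit indices under study have the standard definitions (general background, not specific to the GGUM simulation):

\[ \mathrm{AIC} = -2\ln L + 2k, \qquad \mathrm{BIC} = -2\ln L + k \ln n, \]

where k is the number of estimated parameters and n the sample size; lower values indicate better relative fit, with BIC penalizing model complexity more heavily as n grows.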
Peer reviewed
PDF on ERIC: Download full text
Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine the effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE), using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Jiajing Huang – ProQuest LLC, 2022
The nonequivalent-groups anchor-test (NEAT) data-collection design is commonly used in large-scale assessments. Under this design, different test groups take different test forms. Each test form has its own unique items and all test forms share a set of common items. If item response theory (IRT) models are applied to analyze the test data, the…
Descriptors: Item Response Theory, Test Format, Test Items, Test Construction
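As background on the linking problem the NEAT design creates (the abstract truncates before the study's focus): separately calibrated IRT scales must be placed on a common metric through the common items. One classical option is the mean/sigma transformation

\[ \theta^{*} = A\theta + B, \qquad A = \frac{s(b_Y)}{s(b_X)}, \qquad B = \bar{b}_Y - A\,\bar{b}_X, \]

where b_X and b_Y are the common-item difficulty estimates from the two calibrations.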
Peer reviewed
Direct link
He, Qingping; Meadows, Michelle; Black, Beth – Research Papers in Education, 2022
A potential negative consequence of high-stakes testing is inappropriate test behaviour involving individuals and/or institutions. Inappropriate test behaviour and test collusion can result in aberrant response patterns and anomalous test scores and invalidate the intended interpretation and use of test results. A variety of statistical techniques…
Descriptors: Statistical Analysis, High Stakes Tests, Scores, Response Style (Tests)
Peer reviewed
Direct link
Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023
This article provides a process for carefully evaluating the suitability of a content domain for which diagnostic classification models (DCMs) could be applicable, optimized steps for constructing a test blueprint for applying DCMs, and a real-life example illustrating this process. The content domains were carefully evaluated using a set of…
Descriptors: Classification, Models, Science Tests, Physics
Peer reviewed
Direct link
Weese, James D.; Turner, Ronna C.; Ames, Allison; Crawford, Brandon; Liang, Xinya – Educational and Psychological Measurement, 2022
A simulation study was conducted to investigate the heuristics of the SIBTEST procedure and how it compares with ETS classification guidelines used with the Mantel-Haenszel procedure. Prior heuristics have been used for nearly 25 years, but they are based on a simulation study that was restricted due to computer limitations and that modeled item…
Descriptors: Test Bias, Heuristics, Classification, Statistical Analysis
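The ETS guidelines referenced here classify DIF magnitude on the delta scale, Delta_MH = -2.35 ln(alpha_MH), where alpha_MH is the Mantel-Haenszel common odds ratio. Below is a minimal sketch of the conventional A/B/C size rules, with the significance-test component of the full guidelines omitted for brevity.

import math

def ets_dif_category(alpha_mh: float) -> str:
    """Simplified ETS A/B/C DIF classification from the MH odds ratio.

    Size rules only; the full guidelines also require significance
    tests before assigning categories B and C.
    """
    delta = -2.35 * math.log(alpha_mh)  # MH delta-difference metric
    if abs(delta) < 1.0:
        return "A"   # negligible DIF
    if abs(delta) < 1.5:
        return "B"   # moderate DIF
    return "C"       # large DIF

print(ets_dif_category(0.55))  # delta is about 1.40, so prints "B"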
Peer reviewed
Direct link
Robitzsch, Alexander; Lüdtke, Oliver – Journal of Educational and Behavioral Statistics, 2022
One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance,…
Descriptors: Test Bias, International Assessment, Scaling, Comparative Analysis
Peer reviewed
Direct link
Mark Wilson – Journal of Educational and Behavioral Statistics, 2024
This article introduces a new framework for articulating how educational assessments can be related to teacher uses in the classroom. It articulates three levels of assessment: macro (use of standardized tests), meso (externally developed items), and micro (on-the-fly in the classroom). The first level is the usual context for educational…
Descriptors: Educational Assessment, Measurement, Standardized Tests, Test Items
Peer reviewed
Direct link
Demirkaya, Onur; Bezirhan, Ummugul; Zhang, Jinming – Journal of Educational and Behavioral Statistics, 2023
Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, the research on detecting item preknowledge has progressed on using both item scores and response times. Item revisit patterns of examinees can also be utilized as…
Descriptors: Test Items, Prior Learning, Knowledge Level, Reaction Time
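A minimal, hypothetical sketch of the general idea of combining item scores with response times to flag possible preknowledge; this is not the detection statistic developed in the paper, and the threshold below is invented for illustration.

import numpy as np

# Hypothetical data for one examinee: responses (1 = correct) and
# log response times across 30 items.
rng = np.random.default_rng(1)
correct = rng.integers(0, 2, size=30)
log_rt = rng.normal(loc=3.0, scale=0.5, size=30)

# Flag items answered both correctly and unusually fast relative to the
# examinee's own median time; a crude signal consistent with preknowledge.
fast = log_rt < (np.median(log_rt) - 1.0)
suspect = correct.astype(bool) & fast
print(f"{suspect.sum()} of {correct.size} items flagged")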
Peer reviewed
Direct link
Marc Brysbaert – Cognitive Research: Principles and Implications, 2024
Experimental psychology is witnessing an increase in research on individual differences, which requires the development of new tasks that can reliably assess variations among participants. To do this, cognitive researchers need statistical methods that many of them have not learned during their training. The lack of expertise can pose…
Descriptors: Experimental Psychology, Individual Differences, Statistical Analysis, Task Analysis
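One concrete example of the kind of statistic involved (general background, not drawn from this article): split-half reliability with the Spearman-Brown correction, sketched below under the assumption of a participants-by-trials score matrix.

import numpy as np

def split_half_reliability(scores: np.ndarray) -> float:
    """Spearman-Brown corrected split-half reliability.

    scores: participants x trials array (hypothetical input format).
    Correlates odd- and even-trial means, then corrects the
    half-length correlation up to the full test length.
    """
    odd = scores[:, ::2].mean(axis=1)
    even = scores[:, 1::2].mean(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)  # Spearman-Brown prophecy formula

demo = np.random.default_rng(2).normal(size=(40, 60))
print(split_half_reliability(demo))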
Peer reviewed
Direct link
Wang, Weimeng; Liu, Yang; Liu, Hongyun – Journal of Educational and Behavioral Statistics, 2022
Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection…
Descriptors: Test Bias, Test Items, Equated Scores, Regression (Statistics)
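The DIF definition opening this abstract can be written formally (standard notation, not specific to the paper): item j exhibits DIF between reference (R) and focal (F) groups when, for some trait level theta,

\[ P(X_j = 1 \mid \theta, G = R) \neq P(X_j = 1 \mid \theta, G = F). \]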