Publication Date: In 2025 (0); Since 2024 (17); Since 2021, last 5 years (40)
Descriptor: Item Response Theory (29); Test Items (25); Error of Measurement (9); Item Analysis (9); Models (9); Responses (9); Simulation (8); Computation (7); Multiple Choice Tests (7); Scores (7); Accuracy (6)
Source: Applied Measurement in Education (40)
Publication Type: Journal Articles (40); Reports - Research (37); Information Analyses (2); Tests/Questionnaires (2); Reports - Descriptive (1); Reports - Evaluative (1)
Audience: Practitioners (2)
Location: California (1); Canada (1); Virginia (1)
Assessments and Surveys: Program for International… (2); Measures of Academic Progress (1); National Assessment of… (1); United States Medical… (1)
Stefanie A. Wind; Beyza Aksu-Dunya – Applied Measurement in Education, 2024
Careless responding is a pervasive concern in research using affective surveys. Although researchers have considered various methods for identifying careless responses, few studies have examined the utility of these methods in the context of computer adaptive testing (CAT) for affective scales. Using a simulation study informed by recent…
Descriptors: Response Style (Tests), Computer Assisted Testing, Adaptive Testing, Affective Measures
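As a concrete illustration of the screening problem this entry addresses, one widely used careless-responding index is the length of the longest string of identical consecutive responses. The sketch below is a minimal, hypothetical example of that single index (data and cutoff invented), not the authors' simulation design.

import numpy as np

def longest_run(responses):
    # Length of the longest run of identical consecutive responses.
    best = run = 1
    for prev, cur in zip(responses, responses[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

# Hypothetical 5-point Likert responses for three examinees.
data = np.array([
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],   # suspiciously uniform
    [2, 4, 3, 5, 1, 2, 4, 3, 2, 5],
    [1, 2, 2, 3, 4, 4, 5, 4, 3, 2],
])

# Flag examinees whose longest identical-response run meets an arbitrary cutoff.
flags = [longest_run(row) >= 8 for row in data]
print(flags)  # [True, False, False]

In a CAT, each examinee sees a different and typically shorter item sequence, so indices of this kind can behave quite differently than on fixed-length forms, which is presumably part of what motivates studying them in that context.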
Yue Liu; Zhen Li; Hongyun Liu; Xiaofeng You – Applied Measurement in Education, 2024
Low test-taking effort among examinees has been considered a source of construct-irrelevant variance in item response modeling, leading to serious consequences for parameter estimation. This study aims to investigate how non-effortful response (NER) influences the estimation of item and person parameters in item-pool scale linking (IPSL) and whether…
Descriptors: Item Response Theory, Computation, Simulation, Responses
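For orientation on the linking step referenced above, a common way to place newly calibrated item parameters on a base item-pool scale is a mean/sigma transformation computed from anchor-item difficulties. The numbers below are invented and the sketch is a generic illustration, not the IPSL procedure evaluated in the study.

import numpy as np

# Hypothetical difficulty (b) estimates for the same anchor items,
# once from the base item-pool calibration and once from a new calibration.
b_base = np.array([-1.2, -0.4, 0.1, 0.8, 1.5])
b_new  = np.array([-1.0, -0.2, 0.3, 1.1, 1.9])

# Mean/sigma linking constants for theta_base = A * theta_new + B.
A = b_base.std(ddof=1) / b_new.std(ddof=1)
B = b_base.mean() - A * b_new.mean()

# Transform new-scale 2PL parameters onto the base scale.
a_new = np.array([0.9, 1.1, 1.3, 0.8, 1.0])
b_linked = A * b_new + B
a_linked = a_new / A
print(A, B)

Non-effortful responses distort the anchor-item estimates that feed A and B, which is one mechanism by which they can propagate into the linked scale.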
Chunyan Liu; Raja Subhiyah; Richard A. Feinberg – Applied Measurement in Education, 2024
Mixed-format tests that include both multiple-choice (MC) and constructed-response (CR) items have become widely used in many large-scale assessments. When an item response theory (IRT) model is used to score a mixed-format test, the unidimensionality assumption may be violated if the CR items measure a different construct from that measured by MC…
Descriptors: Test Format, Response Style (Tests), Multiple Choice Tests, Item Response Theory
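As background for the unidimensionality issue raised here, a typical mixed-format calibration pairs a dichotomous model for the MC items with a polytomous model for the CR items under a single latent trait; one common pairing (not necessarily the one used in this article) is the 2PL with the generalized partial credit model:

P(X_i = 1 \mid \theta) = \frac{\exp[a_i(\theta - b_i)]}{1 + \exp[a_i(\theta - b_i)]} \quad \text{(MC items)}

P(X_j = k \mid \theta) = \frac{\exp\left[\sum_{v=1}^{k} a_j(\theta - b_{jv})\right]}{\sum_{c=0}^{m_j} \exp\left[\sum_{v=1}^{c} a_j(\theta - b_{jv})\right]}, \quad k = 0, \dots, m_j \quad \text{(CR items)}

with the empty sum for k = 0 (and c = 0) defined as zero. Both expressions share the same \theta, which is exactly where the unidimensionality assumption enters when MC and CR items tap different constructs.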
Jianbin Fu; Xuan Tan; Patrick C. Kyllonen – Applied Measurement in Education, 2024
A process is proposed to create the one-dimensional expected item characteristic curve (ICC) and test characteristic curve (TCC) for each trait in multidimensional forced-choice questionnaires based on the Rank-2PL (two-parameter logistic) item response theory models for forced-choice items with two or three statements. Some examples of ICC and…
Descriptors: Item Response Theory, Questionnaires, Measurement Techniques, Statistics
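To make the expected curves concrete, the sketch below computes ordinary dichotomous 2PL item characteristic curves on an ability grid and sums them into a test characteristic curve; the Rank-2PL expected curves described in the article are built from statement-level parameters in a more involved way, so the parameter values and the simplification here are purely illustrative.

import numpy as np

# Hypothetical 2PL parameters for five statements/items.
a = np.array([0.8, 1.0, 1.2, 0.9, 1.1])
b = np.array([-1.0, -0.3, 0.0, 0.5, 1.2])

theta = np.linspace(-4, 4, 161)                          # ability grid
icc = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))    # item characteristic curves
tcc = icc.sum(axis=1)                                     # test characteristic curve
print(tcc[::40])                                          # expected score at a few theta values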
Séverin Lions; María Paz Blanco; Pablo Dartnell; Carlos Monsalve; Gabriel Ortega; Julie Lemarié – Applied Measurement in Education, 2024
Multiple-choice items are universally used in formal education. Since they should assess learning, not test-wiseness or guesswork, they must be constructed following the highest possible standards. Hundreds of item-writing guides have provided guidelines to help test developers adopt appropriate strategies to define the distribution and sequence…
Descriptors: Test Construction, Multiple Choice Tests, Guidelines, Test Items
Stefanie A. Wind; Benjamin Lugu – Applied Measurement in Education, 2024
Researchers who use measurement models for evaluation purposes often select models with stringent requirements, such as Rasch models, which are parametric. Mokken Scale Analysis (MSA) offers a theory-driven nonparametric modeling approach that may be more appropriate for some measurement applications. Researchers have discussed using MSA as a…
Descriptors: Item Response Theory, Data Analysis, Simulation, Nonparametric Statistics
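For readers unfamiliar with MSA, its central quantity is Loevinger's scalability coefficient H, which compares observed Guttman errors with those expected under marginal independence. The function below is a minimal sketch for dichotomous items with made-up data; it is not the authors' simulation and ignores standard errors and the item-level and pairwise coefficients.

import numpy as np

def scale_H(X):
    # Loevinger's scale-level H for a persons-by-items 0/1 matrix.
    # A Guttman error for an item pair is endorsing the less popular item
    # while not endorsing the more popular one.
    n, k = X.shape
    p = X.mean(axis=0)                 # item popularities
    observed = expected = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            easy, hard = (i, j) if p[i] >= p[j] else (j, i)
            observed += np.sum((X[:, hard] == 1) & (X[:, easy] == 0))
            expected += n * p[hard] * (1 - p[easy])   # under independence
    return 1.0 - observed / expected

rng = np.random.default_rng(0)
X = (rng.random((200, 6)) < np.linspace(0.3, 0.8, 6)).astype(int)  # hypothetical data
print(scale_H(X))   # near 0 for unrelated items; a weak Mokken scale requires H >= 0.3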
Brian E. Clauser; Victoria Yaneva; Peter Baldwin; Le An Ha; Janet Mee – Applied Measurement in Education, 2024
Multiple-choice questions have become ubiquitous in educational measurement because the format allows for efficient and accurate scoring. Nonetheless, interest in constructed-response formats persists. This interest has driven efforts to develop computer-based scoring procedures that can accurately and efficiently score these items…
Descriptors: Computer Uses in Education, Artificial Intelligence, Scoring, Responses
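One heavily reduced flavor of the computer-based scoring procedures mentioned here is supervised regression on lexical features of the response text. The toy pipeline below (invented example responses and scores) is only meant to show the shape of such a scorer, not the NLP approach the article evaluates.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical human-scored constructed responses used as training data.
responses = [
    "the heart pumps blood through the arteries",
    "blood is moved around the body by the heart",
    "the lungs exchange oxygen and carbon dioxide",
    "i do not know",
]
human_scores = [2, 2, 1, 0]

scorer = make_pipeline(TfidfVectorizer(), Ridge(alpha=1.0))
scorer.fit(responses, human_scores)
print(scorer.predict(["the heart moves blood"]))   # predicted score for a new response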
Mingfeng Xue; Mark Wilson – Applied Measurement in Education, 2024
Multidimensionality is common in psychological and educational measurements. This study focuses on dimensions that converge at the upper anchor (i.e. the highest acquisition status defined in a learning progression) and compares different ways of dealing with them using the multidimensional random coefficients multinomial logit model and scale…
Descriptors: Learning Trajectories, Educational Assessment, Item Response Theory, Evolution
Marcelo Andrade da Silva; A. Corinne Huggins-Manley; Jorge Luis Bazán; Amber Benedict – Applied Measurement in Education, 2024
A Q-matrix is a binary matrix that defines the relationship between items and latent variables; it is widely used in diagnostic classification models (DCMs) and can also be adopted in multidimensional item response theory (MIRT) models. The construction process of the Q-matrix is typically carried out by experts in the subject area of the items…
Descriptors: Q Methodology, Matrices, Item Response Theory, Educational Assessment
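A tiny concrete example may help: below is a hypothetical Q-matrix for five items and three attributes, together with the ideal-response calculation used by a conjunctive DCM such as the DINA model (chosen here only for illustration).

import numpy as np

# Hypothetical Q-matrix: 5 items (rows) by 3 attributes (columns);
# a 1 means the item is assumed to require that attribute.
Q = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
])

# Under a conjunctive model (e.g., DINA), an examinee with attribute pattern
# alpha is expected to answer an item correctly (ignoring slip and guess)
# only if alpha covers every attribute that item's row of Q requires.
alpha = np.array([1, 0, 1])
eta = (alpha >= Q).all(axis=1).astype(int)
print(eta)   # [1 0 0 1 1]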
Sarah Alahmadi; Christine E. DeMars – Applied Measurement in Education, 2024
Large-scale educational assessments are sometimes considered low-stakes, increasing the possibility of confounding true performance level with low motivation. These concerns are amplified in remote testing conditions. To remove the effects of low effort levels in responses observed in remote low-stakes testing, several motivation filtering methods…
Descriptors: Multiple Choice Tests, Item Response Theory, College Students, Scores
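The filtering idea can be illustrated with response-time effort (RTE): classify each response as rapid guessing or solution behavior using a time threshold, then drop examinees whose proportion of effortful responses falls below a cutoff before estimating parameters. The threshold, the 0.90 rule, and the data below are illustrative conventions from the motivation-filtering literature, not necessarily the methods compared in the article.

import numpy as np

# Hypothetical response times (seconds) and scored responses for three examinees.
rt = np.array([[35, 42, 15, 50],
               [ 4,  3,  6,  5],
               [40, 38, 45, 33]])
x  = np.array([[1, 1, 0, 1],
               [0, 1, 0, 0],
               [1, 0, 1, 1]])

threshold = 10                            # rapid-guessing time cutoff (arbitrary)
solution_behavior = rt >= threshold       # True where the response looks effortful
rte = solution_behavior.mean(axis=1)      # response-time effort per examinee

keep = rte >= 0.90                        # a common motivation-filtering rule
x_filtered = x[keep]                      # examinee 2 is removed before estimation
print(rte, keep)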
Bayesian Logistic Regression: A New Method to Calibrate Pretest Items in Multistage Adaptive Testing
TsungHan Ho – Applied Measurement in Education, 2023
An operational multistage adaptive test (MST) requires a large item bank and continuous effort to replenish it, owing to long-term concerns about test security and validity. New items should be pretested and linked to the item bank before being used operationally. The linking item volume fluctuations in…
Descriptors: Bayesian Statistics, Regression (Statistics), Test Items, Pretesting
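The core idea can be sketched as follows: if the provisional abilities from the operational MST stages are treated as known covariates, calibrating one pretest 2PL item reduces to a logistic regression of the item responses on theta, and an L2 penalty plays the role of a normal prior (a MAP flavor of Bayesian logistic regression). Everything below, including the simulated data, the prior strength C, and the reduction itself, is an illustrative assumption rather than the article's exact estimator.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical provisional abilities and simulated responses to one pretest item.
theta = rng.normal(size=500)
a_true, b_true = 1.2, 0.3
y = rng.binomial(1, 1 / (1 + np.exp(-a_true * (theta - b_true))))

# Penalized (MAP-style) logistic regression of y on theta.
fit = LogisticRegression(C=2.0).fit(theta.reshape(-1, 1), y)
a_hat = fit.coef_[0, 0]                   # discrimination on the logistic scale
b_hat = -fit.intercept_[0] / a_hat        # difficulty, since logit = a * (theta - b)
print(a_hat, b_hat)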
Perkins, Beth A.; Pastor, Dena A.; Finney, Sara J. – Applied Measurement in Education, 2021
When tests are low stakes for examinees, meaning there are little to no personal consequences associated with test results, some examinees put little effort into their performance. To understand the causes and consequences of diminished effort, researchers correlate test-taking effort with other variables, such as test-taking emotions and test…
Descriptors: Response Style (Tests), Psychological Patterns, Testing, Differences
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
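To see where the sampling-model issue arises, the sketch below builds the Time A by Time B table for a trend-scoring rescore and computes two statistics commonly reported from such tables; the data are invented, and the closing comment restates the article's caveat rather than reproducing its analysis.

import numpy as np

# Hypothetical original (Time A) and rescored (Time B) ratings for 12 responses.
time_a = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3])
time_b = np.array([0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3])

k = 4
table = np.zeros((k, k), dtype=int)
np.add.at(table, (time_a, time_b), 1)         # Time A x Time B contingency table

exact = np.trace(table) / table.sum()         # proportion exact agreement
pe = (table.sum(1) / table.sum()) @ (table.sum(0) / table.sum())
kappa = (exact - pe) / (1 - pe)               # Cohen's kappa
print(exact, kappa)

# Caveat from the article: when Time A responses are drawn within score levels,
# the row totals are fixed, the table is product-multinomial, and the usual
# multinomial-based behavior of statistics like these no longer holds.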
Chalmers, R. Philip; Zheng, Guoguo – Applied Measurement in Education, 2023
This article presents generalizations of the SIBTEST and crossing-SIBTEST statistics for differential item functioning (DIF) investigations involving more than two groups. After reviewing the original two-group setup for these statistics, a set of multigroup generalizations that support contrast matrices for joint tests of DIF is presented. To…
Descriptors: Test Bias, Test Items, Item Response Theory, Error of Measurement
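For context, the original two-group SIBTEST statistic that these multigroup contrasts generalize has roughly the following form (notation follows common presentations and may differ in detail from the article):

\hat{\beta} = \sum_{k=0}^{K} \hat{p}_k \left( \bar{Y}^{*}_{Rk} - \bar{Y}^{*}_{Fk} \right),
\qquad
B = \frac{\hat{\beta}}{\hat{\sigma}(\hat{\beta})},

where k indexes matching-subtest score levels, \hat{p}_k is the proportion of focal-group examinees at level k, \bar{Y}^{*}_{Rk} and \bar{Y}^{*}_{Fk} are regression-corrected mean scores on the studied item for the reference and focal groups, and B is referred to a standard normal distribution. Crossing-SIBTEST modifies the summand so that differences that change sign across k (crossing DIF) are not cancelled out.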
Daniel Jurich; Chunyan Liu – Applied Measurement in Education, 2023
Screening items for parameter drift helps protect against serious validity threats and ensure score comparability when equating forms. Although many high-stakes credentialing examinations operate with small sample sizes, few studies have investigated methods to detect drift in small sample equating. This study demonstrates that several newly…
Descriptors: High Stakes Tests, Sample Size, Item Response Theory, Equated Scores
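One simple drift-screening device of the kind this line of work evaluates is a robust z statistic computed from anchor-item difficulty differences after linking; the cutoff and data below are arbitrary, and this is not presented as one of the newly proposed methods in the study.

import numpy as np

# Hypothetical anchor-item difficulties from the bank and from a new
# small-sample calibration, already placed on the same scale.
b_old = np.array([-1.4, -0.6, -0.1, 0.3, 0.9, 1.6])
b_new = np.array([-1.3, -0.5, -0.2, 0.4, 1.7, 1.5])

d = b_new - b_old                                    # parameter drift per item
q1, q3 = np.percentile(d, [25, 75])
robust_z = (d - np.median(d)) / ((q3 - q1) / 1.349)  # robust z (IQR-based scale)

flag = np.abs(robust_z) > 2.7                        # arbitrary flagging cutoff
print(robust_z.round(2), flag)                       # the item with d = 0.8 is flagged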