ERIC - Search Results

Publication Date

In 2025	0
Since 2024	5
Since 2021 (last 5 years)	6
Since 2016 (last 10 years)	14
Since 2006 (last 20 years)	28

Descriptor

Test Format	54
Test Items	31
Item Response Theory	18
Test Construction	14
Multiple Choice Tests	13
Mathematics Tests	10
Comparative Analysis	9
Computer Assisted Testing	7
Equated Scores	7
Higher Education	7
Scoring	7
High School Students	6
Item Analysis	6
Responses	6
Difficulty Level	5
Elementary Secondary Education	5
Objective Tests	5
Scores	5
Sex Differences	5
Test Length	5
College Students	4
Grade 8	4
High Schools	4
High Stakes Tests	4
Science Tests	4

Source

Applied Measurement in…

Publication Type

Journal Articles	54
Reports - Research	40
Reports - Evaluative	12
Information Analyses	2
Speeches/Meeting Papers	2
Reports - Descriptive	1

Education Level

Elementary Secondary Education	4
Grade 8	4
Elementary Education	3
High Schools	3
Higher Education	3
Middle Schools	3
Postsecondary Education	3
Secondary Education	3
Grade 4	2
Grade 5	2
Grade 7	2
Junior High Schools	2
Grade 10	1
Grade 11	1
Grade 3	1
Grade 6	1

Location

Canada	1
Israel	1
Massachusetts	1
Spain	1
Texas	1
Turkey	1
United States	1

Assessments and Surveys

Iowa Tests of Basic Skills	2
SAT (College Admission Test)	2
Advanced Placement…	1
Massachusetts Comprehensive…	1
Program for International…	1

Showing 1 to 15 of 54 results

Automated Scoring of Short-Answer Questions: A Progress Report

Peer reviewed

Brian E. Clauser; Victoria Yaneva; Peter Baldwin; Le An Ha; Janet Mee – Applied Measurement in Education, 2024

Multiple-choice questions have become ubiquitous in educational measurement because the format allows for efficient and accurate scoring. Nonetheless, there remains continued interest in constructed-response formats. This interest has driven efforts to develop computer-based scoring procedures that can accurately and efficiently score these items.…

Descriptors: Computer Uses in Education, Artificial Intelligence, Scoring, Responses

Are Online and Paper Tests Comparable? Evidence from Statewide K-12 Tests

Peer reviewed

Ben Backes; James Cowan – Applied Measurement in Education, 2024

We investigate two research questions using a recent statewide transition from paper to computer-based testing: first, the extent to which test mode effects found in prior studies can be eliminated; and second, the degree to which online and paper assessments offer different information about underlying student ability. We first find very small…

Descriptors: Computer Assisted Testing, Test Format, Differences, Academic Achievement

Impact of Violating Unidimensionality on Rasch Calibration for Mixed-Format Tests

Peer reviewed

Chunyan Liu; Raja Subhiyah; Richard A. Feinberg – Applied Measurement in Education, 2024

Mixed-format tests that include both multiple-choice (MC) and constructed-response (CR) items have become widely used in many large-scale assessments. When an item response theory (IRT) model is used to score a mixed-format test, the unidimensionality assumption may be violated if the CR items measure a different construct from that measured by MC…

Descriptors: Test Format, Response Style (Tests), Multiple Choice Tests, Item Response Theory

IRT Characteristic Curve Linking Methods Weighted by Information for Mixed-Format Tests

Peer reviewed

Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024

To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…

Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement

Understanding and Interpreting Human Scoring

Peer reviewed

Glazer, Nancy; Wolfe, Edward W. – Applied Measurement in Education, 2020

This introductory article describes how constructed response scoring is carried out, particularly the rater monitoring processes and illustrates three potential designs for conducting rater monitoring in an operational scoring project. The introduction also presents a framework for interpreting research conducted by those who study the constructed…

Descriptors: Scoring, Test Format, Responses, Predictor Variables

Can Adaptive Testing Improve Test-Taking Experience? A Case Study on Educational Survey Assessment

Peer reviewed

Yi-Hsuan Lee; Yue Jia – Applied Measurement in Education, 2024

Test-taking experience is a consequence of the interaction between students and assessment properties. We define a new notion, rapid-pacing behavior, to reflect two types of test-taking experience -- disengagement and speededness. To identify rapid-pacing behavior, we extend existing methods to develop response-time thresholds for individual items…

Descriptors: Adaptive Testing, Reaction Time, Item Response Theory, Test Format

Using Think-Alouds for Response Process Evidence of Teacher Attentiveness

Peer reviewed

Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021

There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…

Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers

Subscore Equating and Profile Reporting

Peer reviewed

Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020

The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…

Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level

Classification Consistency and Accuracy for Mixed-Format Tests

Peer reviewed

Kim, Stella Y.; Lee, Won-Chan – Applied Measurement in Education, 2019

This study explores classification consistency and accuracy for mixed-format tests using real and simulated data. In particular, the current study compares six methods of estimating classification consistency and accuracy for seven mixed-format tests. The relative performance of the estimation methods is evaluated using simulated data. Study…

Descriptors: Classification, Reliability, Accuracy, Test Format

Statistically Comparing the Performance of Multiple Automated Raters across Multiple Items

Peer reviewed

Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017

Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…

Descriptors: Automation, Scoring, Comparative Analysis, Test Items

Impact of Accumulated Error on Item Response Theory Pre-Equating with Mixed Format Tests

Peer reviewed

Keller, Lisa A.; Keller, Robert; Cook, Robert J.; Colvin, Kimberly F. – Applied Measurement in Education, 2016

The equating of tests is an essential process in high-stakes, large-scale testing conducted over multiple forms or administrations. By adjusting for differences in difficulty and placing scores from different administrations of a test on a common scale, equating allows scores from these different forms and administrations to be directly compared…

Descriptors: Item Response Theory, Equated Scores, Test Format, Testing Programs

An Extension of IRT-Based Equating to the Dichotomous Testlet Response Theory Model

Peer reviewed

Tao, Wei; Cao, Yi – Applied Measurement in Education, 2016

Current procedures for equating number-correct scores using traditional item response theory (IRT) methods assume local independence. However, when tests are constructed using testlets, one concern is the violation of the local item independence assumption. The testlet response theory (TRT) model is one way to accommodate local item dependence.…

Descriptors: Item Response Theory, Equated Scores, Test Format, Models

A Nonparametric Approach for Assessing Goodness-of-Fit of IRT Models in a Mixed Format Test

Peer reviewed

Liang, Tie; Wells, Craig S. – Applied Measurement in Education, 2015

Investigating the fit of a parametric model plays a vital role in validating an item response theory (IRT) model. An area that has received little attention is the assessment of multiple IRT models used in a mixed-format test. The present study extends the nonparametric approach, proposed by Douglas and Cohen (2001), to assess model fit of three…

Descriptors: Nonparametric Statistics, Goodness of Fit, Item Response Theory, Test Format

Bi-Factor MIRT Observed-Score Equating for Mixed-Format Tests

Peer reviewed

Lee, Guemin; Lee, Won-Chan – Applied Measurement in Education, 2016

The main purposes of this study were to develop bi-factor multidimensional item response theory (BF-MIRT) observed-score equating procedures for mixed-format tests and to investigate relative appropriateness of the proposed procedures. Using data from a large-scale testing program, three types of pseudo data sets were formulated: matched samples,…

Descriptors: Test Format, Multidimensional Scaling, Item Response Theory, Equated Scores

Using Necessary Information to Identify Item Dependence in Passage-Based Reading Comprehension Tests

Peer reviewed

Baldonado, Angela Argo; Svetina, Dubravka; Gorin, Joanna – Applied Measurement in Education, 2015

Applications of traditional unidimensional item response theory models to passage-based reading comprehension assessment data have been criticized based on potential violations of local independence. However, simple rules for determining dependency, such as including all items associated with a particular passage, may overestimate the dependency…

Descriptors: Reading Tests, Reading Comprehension, Test Items, Item Response Theory

Author

Lee, Won-Chan	4
Downing, Steven M.	3
DeMars, Christine E.	2
Haladyna, Thomas M.	2
Keller, Lisa A.	2
Lee, Guemin	2
Allalouf, Avi	1
Ansley, Timothy N.	1
Ascalon, M. Evelina	1
Baldonado, Angela Argo	1
Becker, Douglas F.	1
Ben Backes	1
Bennett, Randy Elliot	1
Berberoglu, Giray	1
Boulais, André-Philippe	1
Boyer, Michelle	1
Brian E. Clauser	1
Brown, Richard S.	1
Cao, Yi	1
Carlton, Sydell T.	1
Carney, Michele	1
Cavey, Laurie	1
Chen, Yu-Jen	1
Cheng, Chien-Fen	1