Lawrence T. DeCarlo – Educational and Psychological Measurement, 2024
A psychological framework for different types of items commonly used with mixed-format exams is proposed. A choice model based on signal detection theory (SDT) is used for multiple-choice (MC) items, whereas an item response theory (IRT) model is used for open-ended (OE) items. The SDT and IRT models are shown to share a common conceptualization…
Descriptors: Test Format, Multiple Choice Tests, Item Response Theory, Models
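The abstract is truncated, but the two model families it pairs can be put side by side. Below is a minimal sketch (not DeCarlo's exact parameterization): a 2PL IRT curve for an open-ended item next to a Gaussian signal-detection choice model for an m-option multiple-choice item; all parameter values are illustrative.

```python
import numpy as np
from scipy.stats import norm

def p_correct_2pl(theta, a=1.2, b=0.0):
    """2PL IRT: probability of answering an open-ended item correctly."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def p_correct_sdt(d_prime, m=4):
    """Gaussian SDT choice model for an m-option MC item: the correct
    option's strength ~ N(d', 1), each distractor ~ N(0, 1); the examinee
    picks the strongest option, so P = integral phi(x - d') * Phi(x)^(m-1) dx."""
    x = np.linspace(-8.0, 8.0, 2001)
    dx = x[1] - x[0]
    return np.sum(norm.pdf(x - d_prime) * norm.cdf(x) ** (m - 1)) * dx

print(p_correct_2pl(theta=1.0))         # open-ended item, theta = 1
print(p_correct_sdt(d_prime=1.0, m=4))  # 4-option MC item, d' = 1
```

Both functions map one latent quantity (theta or d') to a probability of success, which is the shared conceptualization the abstract alludes to.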
Gerring, John; Pemstein, Daniel; Skaaning, Svend-Erik – Sociological Methods & Research, 2021
A key obstacle to measurement is the aggregation problem. Where indicators tap into common latent traits in theoretically meaningful ways, the problem may be solved by applying a data-informed ("inductive") measurement model, for example, factor analysis, structural equation models, or item response theory. Where they do not, researchers…
Descriptors: Test Construction, Measures (Individuals), Concept Formation, Social Science Research
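As a hedged illustration of the "inductive" route the abstract mentions, the sketch below aggregates four simulated indicators of a single latent trait using the first principal component; a full analysis would fit a factor, SEM, or IRT model instead.

```python
import numpy as np

rng = np.random.default_rng(1)
latent = rng.normal(size=500)                    # one latent trait
loadings = np.array([0.8, 0.7, 0.6, 0.5])        # four indicators of it
X = latent[:, None] * loadings + rng.normal(scale=0.5, size=(500, 4))

Xc = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
score = Xc @ vt[0]                               # first principal component
# the sign of a principal component is arbitrary, so compare |correlation|
print(abs(np.corrcoef(score, latent)[0, 1]))     # near 1: trait recovered
```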
Wolkowitz, Amanda A.; Foley, Brett; Zurn, Jared – Practical Assessment, Research & Evaluation, 2023
The purpose of this study is to introduce a method for converting scored 4-option multiple-choice (MC) items into scored 3-option MC items without re-pretesting the 3-option MC items. This study describes a six-step process for achieving this goal. Data from a professional credentialing exam were used in this study and the method was applied to 24…
Descriptors: Multiple Choice Tests, Test Items, Accuracy, Test Format
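The six-step procedure itself is not spelled out in this snippet. As a loosely related, purely illustrative adjustment (not the authors' method), one can shift a 3PL item's lower asymptote from chance on four options to chance on three when a distractor is dropped, leaving the other parameters untouched:

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL IRT item response function with pseudo-guessing parameter c."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
a, b = 1.0, 0.0                          # illustrative item parameters
print(p_3pl(theta, a, b, c=0.25))        # scored as a 4-option item
print(p_3pl(theta, a, b, c=1.0 / 3.0))   # rescored as a 3-option item
```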
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
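One of the quantities the study evaluates, the conditional standard error of measurement, follows directly from IRT test information: CSEM(theta) = 1/sqrt(I(theta)). A minimal sketch with dichotomous 2PL items only; a real mixed-format test would add polytomous-item information for the free-response items.

```python
import numpy as np

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

a = np.array([0.8, 1.0, 1.2, 1.5])    # discriminations (illustrative)
b = np.array([-1.0, 0.0, 0.5, 1.0])   # difficulties (illustrative)
for theta in (-2.0, 0.0, 2.0):
    test_info = info_2pl(theta, a, b).sum()   # item informations add
    print(theta, 1.0 / np.sqrt(test_info))    # CSEM at this theta
```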
Martin, Michael O., Ed.; von Davier, Matthias, Ed.; Mullis, Ina V. S., Ed. – International Association for the Evaluation of Educational Achievement, 2020
The chapters in this online volume comprise the TIMSS & PIRLS International Study Center's technical report of the methods and procedures used to develop, implement, and report the results of TIMSS 2019. There were various technical challenges because TIMSS 2019 was the initial phase of the transition to eTIMSS, with approximately half the…
Descriptors: Foreign Countries, Elementary Secondary Education, Achievement Tests, International Assessment
Wan, Lei; Henly, George A. – Applied Measurement in Education, 2012
Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…
Descriptors: Test Items, Test Format, Computer Assisted Testing, Measurement
Pae, Tae-Il – Language Testing, 2012
This study tracked gender differential item functioning (DIF) on the English subtest of the Korean College Scholastic Aptitude Test (KCSAT) over a nine-year period across three data points, using both the Mantel-Haenszel (MH) and item response theory likelihood ratio (IRT-LR) procedures. Further, the study identified two factors (i.e. reading…
Descriptors: Aptitude Tests, Academic Aptitude, Language Tests, Test Items
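The Mantel-Haenszel procedure named in the abstract reduces to a common odds ratio computed across total-score strata, often reported on the ETS delta scale. A toy sketch with fabricated counts:

```python
import numpy as np

# per score stratum: [right_ref, wrong_ref, right_focal, wrong_focal]
strata = np.array([
    [30, 20, 25, 25],
    [45, 15, 38, 22],
    [60, 10, 55, 15],
], dtype=float)

num = den = 0.0
for r_ref, w_ref, r_foc, w_foc in strata:
    n = r_ref + w_ref + r_foc + w_foc
    num += r_ref * w_foc / n
    den += w_ref * r_foc / n

alpha_mh = num / den                  # MH common odds ratio
delta_mh = -2.35 * np.log(alpha_mh)   # ETS delta scale; |delta| around
print(alpha_mh, delta_mh)             # 1.5+ is commonly flagged as DIF
```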
Filipi, Anna – Language Testing, 2012
The Assessment of Language Competence (ALC) certificates comprise an annual, international testing program developed by the Australian Council for Educational Research to test the listening and reading comprehension skills of students in the lower to middle year levels of secondary school. The tests are developed for three levels in French, German, Italian and…
Descriptors: Listening Comprehension Tests, Item Response Theory, Statistical Analysis, Foreign Countries
Webb, Mi-young Lee; Cohen, Allan S.; Schwanenflugel, Paula J. – Educational and Psychological Measurement, 2008
This study investigated the use of latent class analysis for the detection of differences in item functioning on the Peabody Picture Vocabulary Test-Third Edition (PPVT-III). A two-class solution for a latent class model appeared to be defined in part by ability because Class 1 was lower in ability than Class 2 on both the PPVT-III and the…
Descriptors: Item Response Theory, Test Items, Test Format, Cognitive Ability
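A toy version of the latent class analysis applied here: a two-class EM algorithm for binary item responses, with simulated data in which one class is lower in ability by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
true_p = np.array([[0.3, 0.4, 0.2, 0.35],   # class 0: low ability
                   [0.8, 0.9, 0.7, 0.85]])  # class 1: high ability
z = rng.integers(0, 2, size=400)                     # true memberships
X = (rng.random((400, 4)) < true_p[z]).astype(float) # 0/1 responses

pi = np.array([0.5, 0.5])                  # class proportions (start)
p = rng.uniform(0.25, 0.75, size=(2, 4))   # item probabilities per class
for _ in range(200):
    # E-step: posterior class membership given the response pattern
    logl = X @ np.log(p).T + (1 - X) @ np.log(1 - p).T + np.log(pi)
    post = np.exp(logl - logl.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)
    # M-step: update class proportions and item probabilities
    pi = post.mean(axis=0)
    p = np.clip((post.T @ X) / post.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)

print(np.round(p, 2))  # compare with true_p (class labels may be swapped)
```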
Einarsdottir, Sif; Rounds, James – Journal of Vocational Behavior, 2009
Item response theory was used to address gender bias in interest measurement. A differential item functioning (DIF) technique, SIBTEST, together with DIMTEST for dimensionality, was applied to the items of the six General Occupational Theme (GOT) and 25 Basic Interest (BI) scales in the Strong Interest Inventory. A sample of 1860 women and 1105 men was used.…
Descriptors: Test Format, Females, Vocational Interests, Construct Validity
Judd, Wallace – Practical Assessment, Research & Evaluation, 2009
Over the past twenty years in performance testing, a specific item type with distinguishing characteristics has arisen time and again. It has been invented independently by dozens of test development teams, and yet this item type is not recognized in the research literature. This article is an invitation to investigate the item type, evaluate…
Descriptors: Test Items, Test Format, Evaluation, Item Analysis
Keng, Leslie; McClarty, Katie Larsen; Davis, Laurie Laughlin – Applied Measurement in Education, 2008
This article describes a comparative study conducted at the item level for paper and online administrations of a statewide high stakes assessment. The goal was to identify characteristics of items that may have contributed to mode effects. Item-level analyses compared two modes of the Texas Assessment of Knowledge and Skills (TAKS) for up to four…
Descriptors: Computer Assisted Testing, Geometric Concepts, Grade 8, Comparative Analysis
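A minimal sketch of one item-level mode-effect check of the kind such studies run (not necessarily the authors' analysis): a two-proportion z-test comparing an item's proportion correct on paper versus online, with fabricated counts.

```python
import math

def mode_effect_z(correct_paper, n_paper, correct_online, n_online):
    """z statistic for the difference in proportion correct across modes."""
    p1, p2 = correct_paper / n_paper, correct_online / n_online
    pooled = (correct_paper + correct_online) / (n_paper + n_online)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_paper + 1 / n_online))
    return (p1 - p2) / se

print(mode_effect_z(720, 1000, 655, 1000))  # positive z: easier on paper
```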
Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008
This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…
Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods
Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen – Applied Measurement in Education, 2008
In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem: the time-consuming selection of items so that the resulting tests have approximately the same test information functions (TIFs) while satisfying the same constraints. This article proposes a novel method, a genetic algorithm (GA), to construct parallel…
Descriptors: Test Format, Measurement Techniques, Equations (Mathematics), Item Response Theory
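A toy sketch of the idea (not the authors' implementation): a genetic algorithm searches an item pool for a fixed-length form whose TIF matches a target form's TIF on a theta grid; all item parameters are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)
N_POOL, FORM_LEN = 60, 20
A = rng.uniform(0.5, 2.0, N_POOL)    # 2PL discriminations (simulated)
B = rng.uniform(-2.0, 2.0, N_POOL)   # 2PL difficulties (simulated)
GRID = np.linspace(-3.0, 3.0, 13)    # theta points where TIFs must match

def tif(form):
    """Test information of the selected items at each grid point."""
    p = 1.0 / (1.0 + np.exp(-A[form, None] * (GRID - B[form, None])))
    return (A[form, None] ** 2 * p * (1.0 - p)).sum(axis=0)

target = tif(rng.choice(N_POOL, FORM_LEN, replace=False))  # reference form

def fitness(form):
    return -np.abs(tif(form) - target).sum()  # smaller TIF gap = fitter

def crossover(f1, f2):
    union = np.unique(np.concatenate([f1, f2]))
    return rng.choice(union, FORM_LEN, replace=False)

def mutate(form):
    out = form.copy()
    unused = np.setdiff1d(np.arange(N_POOL), out)
    out[rng.integers(FORM_LEN)] = rng.choice(unused)  # swap in one item
    return out

pop = [rng.choice(N_POOL, FORM_LEN, replace=False) for _ in range(40)]
for _ in range(200):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                               # elitist selection
    children = [mutate(crossover(parents[rng.integers(10)],
                                 parents[rng.integers(10)]))
                for _ in range(30)]
    pop = parents + children

best = max(pop, key=fitness)
print(-fitness(best))  # remaining total |TIF difference| of the best form
```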
von Davier, Alina A.; Wilson, Christine – Applied Psychological Measurement, 2008
Dorans and Holland (2000) and von Davier, Holland, and Thayer (2003) introduced measures of the degree to which an observed-score equating function is sensitive to the population on which it is computed. This article extends the findings of Dorans and Holland and of von Davier et al. to item response theory (IRT) true-score equating methods that…
Descriptors: Advanced Placement, Advanced Placement Programs, Equated Scores, Calculus
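IRT true-score equating, the family of methods this article extends, maps a raw score on Form X to the theta at which Form X's expected score equals it, then reads off Form Y's expected score at that theta. A minimal 2PL sketch with illustrative parameters assumed to already be on a common scale:

```python
import numpy as np

theta = np.linspace(-5, 5, 1001)

def true_score(a, b):
    """Expected number-correct on a 2PL form as a function of theta."""
    p = 1.0 / (1.0 + np.exp(-a[:, None] * (theta - b[:, None])))
    return p.sum(axis=0)

a_x, b_x = np.array([0.9, 1.1, 1.3]), np.array([-0.5, 0.0, 0.8])
a_y, b_y = np.array([1.0, 1.0, 1.4]), np.array([-0.2, 0.3, 0.6])

tau_x, tau_y = true_score(a_x, b_x), true_score(a_y, b_y)
x = 2.0                              # a raw score on Form X
th = np.interp(x, tau_x, theta)      # invert Form X's true-score curve
print(np.interp(th, theta, tau_y))   # equated score on Form Y
```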