Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 5 |
Since 2016 (last 10 years) | 8 |
Since 2006 (last 20 years) | 13 |
Descriptor
Test Format | 33 |
Test Items | 33 |
Test Construction | 16 |
Multiple Choice Tests | 12 |
Comparative Analysis | 7 |
Computer Assisted Testing | 7 |
Higher Education | 7 |
Item Response Theory | 7 |
Scores | 7 |
Item Analysis | 6 |
Test Bias | 5 |
More ▼ |
Source
Journal of Educational… | 33 |
Author
van der Linden, Wim J. | 3 |
Debeer, Dries | 2 |
Adema, Jos J. | 1 |
Albano, Anthony D. | 1 |
Algina, James | 1 |
Ali, Usama S. | 1 |
Allalouf, Avi | 1 |
Askegaard, Lewis D. | 1 |
Baldwin, Peter | 1 |
Benson, Jeri | 1 |
Braun, Henry I. | 1 |
More ▼ |
Publication Type
Journal Articles | 33 |
Reports - Research | 24 |
Reports - Descriptive | 6 |
Reports - Evaluative | 3 |
Guides - Non-Classroom | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Secondary Education | 2 |
Elementary Secondary Education | 1 |
Grade 8 | 1 |
Higher Education | 1 |
Postsecondary Education | 1 |
Audience
Researchers | 1 |
Laws, Policies, & Programs
Assessments and Surveys
Graduate Record Examinations | 2 |
Program for International… | 2 |
Advanced Placement… | 1 |
Mathematics Anxiety Rating… | 1 |
National Assessment of… | 1 |
State Trait Anxiety Inventory | 1 |
What Works Clearinghouse Rating
Jianbin Fu; Xuan Tan; Patrick C. Kyllonen – Journal of Educational Measurement, 2024
This paper presents the item and test information functions of the Rank two-parameter logistic models (Rank-2PLM) for items with two (pair) and three (triplet) statements in forced-choice questionnaires. The Rank-2PLM model for pairs is the MUPP-2PLM (Multi-Unidimensional Pairwise Preference) and, for triplets, is the Triplet-2PLM. Fisher's…
Descriptors: Questionnaires, Test Items, Item Response Theory, Models
Choe, Edison M.; Han, Kyung T. – Journal of Educational Measurement, 2022
In operational testing, item response theory (IRT) models for dichotomous responses are popular for measuring a single latent construct [theta], such as cognitive ability in a content domain. Estimates of [theta], also called IRT scores or [theta hat], can be computed using estimators based on the likelihood function, such as maximum likelihood…
Descriptors: Scores, Item Response Theory, Test Items, Test Format
Baldwin, Peter; Clauser, Brian E. – Journal of Educational Measurement, 2022
While score comparability across test forms typically relies on common (or randomly equivalent) examinees or items, innovations in item formats, test delivery, and efforts to extend the range of score interpretation may require a special data collection before examinees or items can be used in this way--or may be incompatible with common examinee…
Descriptors: Scoring, Testing, Test Items, Test Format
Guo, Wenjing; Wind, Stefanie A. – Journal of Educational Measurement, 2021
The use of mixed-format tests made up of multiple-choice (MC) items and constructed response (CR) items is popular in large-scale testing programs, including the National Assessment of Educational Progress (NAEP) and many district- and state-level assessments in the United States. Rater effects, or raters' scoring tendencies that result in…
Descriptors: Test Format, Multiple Choice Tests, Scoring, Test Items
Li, Jie; van der Linden, Wim J. – Journal of Educational Measurement, 2018
The final step of the typical process of developing educational and psychological tests is to place the selected test items in a formatted form. The step involves the grouping and ordering of the items to meet a variety of formatting constraints. As this activity tends to be time-intensive, the use of mixed-integer programming (MIP) has been…
Descriptors: Programming, Automation, Test Items, Test Format
Shear, Benjamin R. – Journal of Educational Measurement, 2023
Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…
Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests
Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis
Liu, Chen-Wei; Wang, Wen-Chung – Journal of Educational Measurement, 2017
The examinee-selected-item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set of items (e.g., choose one item to respond from a pair of items), always yields incomplete data (i.e., only the selected items are answered and the others have missing data) that are likely nonignorable. Therefore, using…
Descriptors: Item Response Theory, Models, Maximum Likelihood Statistics, Data Analysis
Albano, Anthony D. – Journal of Educational Measurement, 2013
In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…
Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques
van der Linden, Wim J.; Diao, Qi – Journal of Educational Measurement, 2011
In automated test assembly (ATA), the methodology of mixed-integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different…
Descriptors: Test Items, Test Format, Test Construction, Item Banks
Debeer, Dries; Janssen, Rianne – Journal of Educational Measurement, 2013
Changing the order of items between alternate test forms to prevent copying and to enhance test security is a common practice in achievement testing. However, these changes in item order may affect item and test characteristics. Several procedures have been proposed for studying these item-order effects. The present study explores the use of…
Descriptors: Item Response Theory, Test Items, Test Format, Models
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
Penfield, Randall D.; Algina, James – Journal of Educational Measurement, 2006
One approach to measuring unsigned differential test functioning is to estimate the variance of the differential item functioning (DIF) effect across the items of the test. This article proposes two estimators of the DIF effect variance for tests containing dichotomous and polytomous items. The proposed estimators are direct extensions of the…
Descriptors: Test Bias, Test Format, Test Items, Simulation

DeMars, Christine E. – Journal of Educational Measurement, 2003
Generated data to simulate multidimensionality resulting from including two or four subtopics on a test. DIMTEST analysis results suggest that including multiple topics, when they are commonly taught together, can lead to conceptual multidimensionality and mathematical multidimensionality. (SLD)
Descriptors: Curriculum, Simulation, Test Construction, Test Format

van der Linden, Wim J.; Adema, Jos J. – Journal of Educational Measurement, 1998
Proposes an algorithm for the assembly of multiple test forms in which the multiple-form problem is reduced to a series of computationally less intensive two-form problems. Illustrates how the method can be implemented using 0-1 linear programming and gives two examples. (SLD)
Descriptors: Algorithms, Linear Programming, Test Construction, Test Format