Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 5 |
Descriptor
Source
Educational Measurement:… | 11 |
Author
Katz, Irvin R. | 2 |
Keehner, Madeleine | 2 |
Albanese, Mark A. | 1 |
April L. Zenisky | 1 |
Arslan, Burcu | 1 |
Dorans, Neil J. | 1 |
Downing, Steven M. | 1 |
Frey, Andreas | 1 |
Frisbie, David A. | 1 |
Gong, Tao | 1 |
Hartig, Johannes | 1 |
More ▼ |
Publication Type
Journal Articles | 11 |
Reports - Evaluative | 4 |
Information Analyses | 3 |
Reports - Descriptive | 3 |
Reports - Research | 2 |
Speeches/Meeting Papers | 2 |
Education Level
Adult Education | 1 |
Elementary Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
SAT (College Admission Test) | 1 |
What Works Clearinghouse Rating
Arslan, Burcu; Jiang, Yang; Keehner, Madeleine; Gong, Tao; Katz, Irvin R.; Yan, Fred – Educational Measurement: Issues and Practice, 2020
Computer-based educational assessments often include items that involve drag-and-drop responses. There are different ways that drag-and-drop items can be laid out and different choices that test developers can make when designing these items. Currently, these decisions are based on experts' professional judgments and design constraints, rather…
Descriptors: Test Items, Computer Assisted Testing, Test Format, Decision Making
Stephen G. Sireci; Javier Suárez-Álvarez; April L. Zenisky; Maria Elena Oliveri – Educational Measurement: Issues and Practice, 2024
The goal in personalized assessment is to best fit the needs of each individual test taker, given the assessment purposes. Design-in-Real-Time (DIRTy) assessment reflects the progressive evolution in testing from a single test, to an adaptive test, to an adaptive assessment "system." In this article, we lay the foundation for DIRTy…
Descriptors: Educational Assessment, Student Needs, Test Format, Test Construction
Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Measurement: Issues and Practice, 2019
The current study investigated how item formats and their inherent affordances influence test-takers' cognition under uncertainty. Adult participants solved content-equivalent math items in multiple-selection multiple-choice and four alternative grid formats. The results indicated that participants' affirmative response tendency (i.e., judge the…
Descriptors: Affordances, Test Items, Test Format, Test Wiseness
Liu, Jinghua; Dorans, Neil J. – Educational Measurement: Issues and Practice, 2013
We make a distinction between two types of test changes: inevitable deviations from specifications versus planned modifications of specifications. We describe how score equity assessment (SEA) can be used as a tool to assess a critical aspect of construct continuity, the equivalence of scores, whenever planned changes are introduced to testing…
Descriptors: Tests, Test Construction, Test Format, Change
Frey, Andreas; Hartig, Johannes; Rupp, Andre A. – Educational Measurement: Issues and Practice, 2009
In most large-scale assessments of student achievement, several broad content domains are tested. Because more items are needed to cover the content domains than can be presented in the limited testing time to each individual student, multiple test forms or booklets are utilized to distribute the items to the students. The construction of an…
Descriptors: Measures (Individuals), Test Construction, Theory Practice Relationship, Design

Albanese, Mark A. – Educational Measurement: Issues and Practice, 1993
A comprehensive review is given of evidence, with a bearing on the recommendation to avoid use of complex multiple choice (CMC) items. Avoiding Type K items (four primary responses and five secondary choices) seems warranted, but evidence against CMC in general is less clear. (SLD)
Descriptors: Cues, Difficulty Level, Multiple Choice Tests, Responses

Downing, Steven M. – Educational Measurement: Issues and Practice, 1992
Research on true-false (TF), multiple-choice, and alternate-choice (AC) tests is reviewed, discussing strengths, weaknesses, and the usefulness in classroom and large-scale testing of each. Recommendations are made for improving use of AC items to overcome some of the problems associated with TF items. (SLD)
Descriptors: Comparative Analysis, Educational Research, Multiple Choice Tests, Objective Tests

Wainer, Howard – Educational Measurement: Issues and Practice, 1993
Some cautions are sounded for converting a linearly administered test to an adaptive format. Four areas are identified in which practices broadly used in traditionally constructed tests can have adverse effects if thoughtlessly adopted when a test is administered in an adaptive mode. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Educational Practices, Test Construction

Solano-Flores, Guillermo; Shavelson, Richard J. – Educational Measurement: Issues and Practice, 1997
Conceptual, practical, and logistical issues in the development of science performance assessments (SPAs) are discussed. The conceptual framework identifies task, response format, and scoring system as components, and conceives of SPAs as tasks that attempt to recreate conditions in which scientists work. Developing SPAs is a sophisticated effort…
Descriptors: Elementary Secondary Education, Performance Based Assessment, Science Education, Science Tests

Frisbie, David A. – Educational Measurement: Issues and Practice, 1992
Literature related to the multiple true-false (MTF) item format is reviewed. Each answer cluster of a MTF item may have several true items and the correctness of each is judged independently. MTF tests appear efficient and reliable, although they are a bit harder than multiple choice items for examinees. (SLD)
Descriptors: Achievement Tests, Difficulty Level, Literature Reviews, Multiple Choice Tests

Sireci, Stephen G. – Educational Measurement: Issues and Practice, 1997
Different methodologies for linking tests across languages are reviewed and evaluated, focusing on monolingual item response theory, bilingual group designs, and matched monolingual group designs. These methods, although not without weaknesses, are superior for promoting score comparability than methods that rely on translation or expert judgment…
Descriptors: Bilingualism, Comparative Analysis, Cross Cultural Studies, Educational Assessment