Harrison, Scott; Kroehne, Ulf; Goldhammer, Frank; Lüdtke, Oliver; Robitzsch, Alexander – Large-scale Assessments in Education, 2023
Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone to the interpretation of the results of the PISA test scores. However, an…
Descriptors: Scoring, Test Items, Difficulty Level, Foreign Countries
Gustafsson, Martin; Barakat, Bilal Fouad – Comparative Education Review, 2023
International assessments inform education policy debates, yet little is known about their floor effects: To what extent do they fail to differentiate between the lowest performers, and what are the implications of this? TIMSS, SACMEQ, and LLECE data are analyzed to answer this question. In TIMSS, floor effects have been reduced through the…
Descriptors: Achievement Tests, Elementary Secondary Education, International Assessment, Foreign Countries
Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…
Descriptors: Scoring, Test Items, Test Format, Raw Scores
Item Order and Speededness: Implications for Test Fairness in Higher Educational High-Stakes Testing
Becker, Benjamin; van Rijn, Peter; Molenaar, Dylan; Debeer, Dries – Assessment & Evaluation in Higher Education, 2022
A common approach to increase test security in higher educational high-stakes testing is the use of different test forms with identical items but different item orders. The effects of such varied item orders are relatively well studied, but findings have generally been mixed. When multiple test forms with different item orders are used, we argue…
Descriptors: Information Security, High Stakes Tests, Computer Security, Test Items
Karadag, Nejdet; Boz Yuksekdag, Belgin; Akyildiz, Murat; Ibileme, Ali Ihsan – Turkish Online Journal of Distance Education, 2021
The aim of this study is to determine students' opinions about open-ended question exam practice during 2018-2019 academic year for the following programs of Anadolu University Open Education System: Economy, Hospitality Management, Philosophy, History, Sociology, and Turkish Language and Literature. The study was designed as a quantitative study…
Descriptors: Test Format, Test Items, Response Style (Tests), Scoring
Keng, Leslie; Boyer, Michelle – National Center for the Improvement of Educational Assessment, 2020
ACT requested assistance from the National Center for the Improvement of Educational Assessment (Center for Assessment) to investigate declines of scores for states administering the ACT to its 11th grade students in 2018. This request emerged from conversations among state leaders, the Center for Assessment, and ACT in trying to understand the…
Descriptors: College Entrance Examinations, Scores, Test Score Decline, Educational Trends
Eckerly, Carol; Smith, Russell; Sowles, John – Practical Assessment, Research & Evaluation, 2018
The Discrete Option Multiple Choice (DOMC) item format was introduced by Foster and Miller (2009) with the intent of improving the security of test content. However, by changing the amount and order of the content presented, the test taking experience varies by test taker, thereby introducing potential fairness issues. In this paper we…
Descriptors: Culture Fair Tests, Multiple Choice Tests, Testing, Test Items
Schoen, Robert C.; Yang, Xiaotong; Liu, Sicong; Paek, Insu – Grantee Submission, 2017
The Early Fractions Test v2.2 is a paper-pencil test designed to measure mathematics achievement of third- and fourth-grade students in the domain of fractions. The purpose, or intended use, of the Early Fractions Test v2.2 is to serve as a measure of student outcomes in a randomized trial designed to estimate the effect of an educational…
Descriptors: Psychometrics, Mathematics Tests, Mathematics Achievement, Fractions
Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013
The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…
Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores
Plake, Barbara S.; And Others – 1983
Differential test performance by undergraduate males and females enrolled in a developmental educational psychology course (n=167) was reported on a quantitative examination as a function of item arrangement. Males were expected to perform better than females on tests whose items were arranged from easy to hard. Plake and Ansorge (1982) speculated this may…
Descriptors: Difficulty Level, Feedback, Higher Education, Scoring

Frisbie, David A. – Educational Measurement: Issues and Practice, 1992
Literature related to the multiple true-false (MTF) item format is reviewed. Each answer cluster of a MTF item may have several true items and the correctness of each is judged independently. MTF tests appear efficient and reliable, although they are a bit harder than multiple choice items for examinees. (SLD)
Descriptors: Achievement Tests, Difficulty Level, Literature Reviews, Multiple Choice Tests
Kirisci, Levent; Hsu, Tse-Chi – 1992
A predictive adaptive testing (PAT) strategy was developed based on statistical predictive analysis, and its feasibility was studied by comparing PAT performance to those of the Flexilevel, Bayesian modal, and expected a posteriori (EAP) strategies in a simulated environment. The proposed adaptive test is based on the idea of using item difficulty…
Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Analysis, Computer Assisted Testing

Hyers, Albert D.; Anderson, Paul S. – 1991
Using matched pairs of geography questions, a new testing method for machine-scored fill-in-the-blank, multiple-digit testing (MDT) questions was compared to the traditional multiple-choice (MC) style. Data were from 118 matched or parallel test items for 4 tests from 764 college students of geography. The new method produced superior results when…
Descriptors: College Students, Comparative Testing, Computer Assisted Testing, Difficulty Level
Ward, William C.; And Others – 1986
The keylist format (rather than the conventional multiple-choice format) for item presentation provides a machine-scorable surrogate for a truly free-response test. In this format, the examinee is required to think of an answer, look it up in a long ordered list, and enter its number on an answer sheet. The introduction of keylist items into…
Descriptors: Analogy, Aptitude Tests, Construct Validity, Correlation
Wang, Lih Shing; Stansfield, Charles W. – 1988
The manual for administration of the Chinese Proficiency Test contains an overview of the program, including: (1) its history, content, and format; (2) its primary focus and uses; (3) administration procedures, including registration, ordering the test, reporting scores, and billing; (4) the interpretation of test scores based on normative data…
Descriptors: Chinese, Difficulty Level, Item Analysis, Language Proficiency