Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
Item Order and Speededness: Implications for Test Fairness in Higher Educational High-Stakes Testing
Becker, Benjamin; van Rijn, Peter; Molenaar, Dylan; Debeer, Dries – Assessment & Evaluation in Higher Education, 2022
A common approach to increase test security in higher educational high-stakes testing is the use of different test forms with identical items but different item orders. The effects of such varied item orders are relatively well studied, but findings have generally been mixed. When multiple test forms with different item orders are used, we argue…
Descriptors: Information Security, High Stakes Tests, Computer Security, Test Items
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Kim, Sooyeon; Moses, Tim – ETS Research Report Series, 2014
The purpose of this study was to investigate the potential impact of misrouting under a 2-stage multistage test (MST) design, which includes 1 routing and 3 second-stage modules. Simulations were used to create a situation in which a large group of examinees took each of the 3 possible MST paths (high, middle, and low). We compared differences in…
Descriptors: Comparative Analysis, Difficulty Level, Scores, Test Wiseness
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
Li, Yuan H.; Griffith, William D.; Tam, Hak P. – 1997
This study explores the relative merits of a potentially useful item response theory (IRT) linking design: using a single set of anchor items with fixed common item parameters (FCIP) during the calibration process. An empirical study was conducted to investigate the appropriateness of this linking design using 6 groups of students taking 6 forms…
Descriptors: Ability, Difficulty Level, Equated Scores, Error of Measurement

