Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 2 |
| Since 2007 (last 20 years) | 7 |
Descriptor
| Scores | 12 |
| Simulation | 12 |
| Test Format | 12 |
| Comparative Analysis | 8 |
| Test Items | 7 |
| Item Response Theory | 6 |
| Computer Assisted Testing | 5 |
| Ability | 3 |
| Equated Scores | 3 |
| Error of Measurement | 3 |
| Estimation (Mathematics) | 3 |
Source
| ETS Research Report Series | 2 |
| Journal of Educational… | 2 |
| ProQuest LLC | 2 |
| Education and Information… | 1 |
| Eurasian Journal of… | 1 |
Publication Type
| Journal Articles | 6 |
| Reports - Evaluative | 5 |
| Reports - Research | 5 |
| Dissertations/Theses -… | 2 |
| Speeches/Meeting Papers | 1 |
Gurdil Ege, Hatice; Demir, Ergul – Eurasian Journal of Educational Research, 2020
Purpose: The present study aims to evaluate how the reliabilities computed using α, Stratified α, Angoff-Feldt, and Feldt-Raju estimators may differ when sample size (500, 1000, and 2000) and item type ratio of dichotomous to polytomous items (2:1, 1:1, 1:2) included in the scale are varied. Research Methods: In this study, Cronbach's α,…
Descriptors: Test Format, Simulation, Test Reliability, Sample Size
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
Computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. Multidimensional CAT (MCAT) designs differ in the item selection, ability estimation, and termination methods they use. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Kim, Sooyeon; Moses, Tim – ETS Research Report Series, 2014
The purpose of this study was to investigate the potential impact of misrouting under a 2-stage multistage test (MST) design, which includes 1 routing and 3 second-stage modules. Simulations were used to create a situation in which a large group of examinees took each of the 3 possible MST paths (high, middle, and low). We compared differences in…
Descriptors: Comparative Analysis, Difficulty Level, Scores, Test Wiseness
Andrews, Benjamin James – ProQuest LLC, 2011
The equity properties can be used to assess the quality of an equating. The degree to which expected scores conditional on ability are similar between test forms is referred to as first-order equity. Second-order equity is the degree to which conditional standard errors of measurement are similar between test forms after equating. The purpose of…
Descriptors: Test Format, Advanced Placement, Simulation, True Scores
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
van der Ark, L. Andries; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational Measurement, 2008
Two types of answer-copying statistics for detecting copiers in small-scale examinations are proposed. One statistic identifies the "copier-source" pair, and the other in addition suggests who is copier and who is source. Both types of statistics can be used when the examination has alternate test forms. A simulation study shows that the…
Descriptors: Cheating, Statistics, Test Format, Measures (Individuals)
Pommerich, Mary; Nicewander, W. Alan; Hanson, Bradley A. – Journal of Educational Measurement, 1999 (peer reviewed)
Studied whether a group's average percent correct in a content domain could be accurately estimated for groups taking a single test form and not the entire domain of items. Evaluated six Item Response Theory-based domain score estimation methods through simulation and concluded they performed better than observed score on the form taken. (SLD)
Descriptors: Estimation (Mathematics), Groups, Item Response Theory, Scores
Pommerich, Mary; Nicewander, W. Alan – 1998
A simulation study was performed to determine whether a group's average percent correct in a content domain could be accurately estimated for groups taking a single test form and not the entire domain of items. Six Item Response Theory (IRT)-based domain score estimation methods were evaluated, under conditions of few items per content area per…
Descriptors: Ability, Estimation (Mathematics), Groups, Item Response Theory
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Pommerich, Mary; Nicewander, W. Alan – 1998
A simulation study was performed to determine whether a group's average percent correct in a content domain could be accurately estimated for groups taking a single test form and not the entire domain of items. Six Item Response Theory (IRT)-based domain score estimation methods were evaluated, under conditions of few items per content area per…
Descriptors: Ability, Estimation (Mathematics), Group Membership, Item Response Theory
Thomasson, Gary L. – 1997
Score comparability is important to those who take tests and those who use them. One important concept related to test score comparability is that of "equity," which is defined as existing when examinees are indifferent as to which of two alternate forms of a test they would prefer to take. By their nature, computerized adaptive tests…
Descriptors: Ability, Adaptive Testing, Comparative Analysis, Computer Assisted Testing
Stocking, Martha L. – 1988
The construction of parallel editions of conventional tests for purposes of test security while maintaining score comparability has always been a recognized and difficult problem in psychometrics and test construction. The introduction of new modes of test construction, e.g., adaptive testing, changes the nature of the problem, but does not make…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Identification

