Practical Considerations in Choosing an Anchor Test Form for Equating under the Random Groups Design
Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023
Careful considerations are necessary when there is a need to choose an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…
Descriptors: Test Format, Equated Scores, Best Practices, Test Construction
Yigiter, Mahmut Sami; Dogan, Nuri – Measurement: Interdisciplinary Research and Perspectives, 2023
In recent years, Computerized Multistage Testing (MST), with its versatile benefits, has found wide application in large-scale assessments and has grown in popularity. The fact that forms can be made ready before the test administration, as with a linear test, and that they can be adapted according to the test taker's ability…
Descriptors: Programming Languages, Monte Carlo Methods, Computer Assisted Testing, Test Format
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Wyse, Adam E.; Babcock, Ben – Journal of Educational Measurement, 2016
A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…
Descriptors: Cutting Scores, Psychometrics, Test Construction, Classification
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013
The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…
Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation
Tian, Feng – ProQuest LLC, 2011
There has been a steady increase in the use of mixed-format tests, that is, tests consisting of both multiple-choice items and constructed-response items in both classroom and large-scale assessments. This calls for appropriate equating methods for such tests. As Item Response Theory (IRT) has rapidly become mainstream as the theoretical basis for…
Descriptors: Item Response Theory, Comparative Analysis, Equated Scores, Statistical Analysis
DeMars, Christine E. – Journal of Educational Measurement, 2003 (peer reviewed)
Generated data to simulate multidimensionality resulting from including two or four subtopics on a test. DIMTEST analysis results suggest that including multiple topics, when they are commonly taught together, can lead to conceptual multidimensionality and mathematical multidimensionality. (SLD)
Descriptors: Curriculum, Simulation, Test Construction, Test Format
Wang, Tianyou; Hanson, Bradley A.; Harris, Deborah J. – 1998
Equating a test form to itself through a chain of equatings, commonly referred to as circular equating, has been widely used as a criterion to evaluate the adequacy of equating. This paper uses both analytical methods and simulation methods to show that this criterion is in general invalid in serving this purpose. For the random groups design done…
Descriptors: Equated Scores, Evaluation Methods, Heuristics, Sampling
Stocking, Martha L. – 1988
The construction of parallel editions of conventional tests for purposes of test security while maintaining score comparability has always been a recognized and difficult problem in psychometrics and test construction. The introduction of new modes of test construction, e.g., adaptive testing, changes the nature of the problem, but does not make…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Identification
Stone, Gregory Ethan – 1994
The quality of fit between the data and the measurement model is fundamental to any discussion of results. Fit has been the subject of inquiry since as early as the 1920s. Most early explorations concentrated on assessing global fit or subset fits on fixed length, traditional paper and pencil tests given as a single unit. The detection of aberrant…
Descriptors: Adaptive Testing, Computer Assisted Testing, Educational Assessment, Educational History
Swaak, Janine; de Jong, Ton – Studies in Educational Evaluation, 1996 (peer reviewed)
A way to assess knowledge acquired through simulation-based learning (intuitive knowledge) is presented. A "WHAT-IF" test item format is developed, and two pilot studies involving 74 college students responding to WHAT-IF items are described. The tests did tap improvement in learning, although test validity was only partially supportive.…
Descriptors: College Students, Computer Assisted Testing, Elementary Secondary Education, Higher Education
Clyman, Stephen G.; Orr, Nancy A. – Academic Medicine, 1990 (peer reviewed)
The process proposed for the development and use of computer-based testing, including simulation and multiple-choice questions, as part of the National Board of Medical Examiners' certification sequence is outlined. Summary reports of first-phase pilot testing in six medical schools are appended. (MSE)
Descriptors: Computer Assisted Testing, Higher Education, Licensing Examinations (Professions), Medical Education
Eignor, Daniel R.; Hambleton, Ronald K. – 1979
The purpose of the investigation was to obtain some relationships among (1) test lengths, (2) shape of domain-score distributions, (3) advancement scores, and (4) several criterion-referenced test score reliability and validity indices. The study was conducted using computer simulation methods. The values of variables under study were set to be…
Descriptors: Comparative Analysis, Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores
Finch, Fredrick; Foertsch, Mary – 1993
Performance assessment is reviewed as an emerging form of alternative assessment, focusing on how it has been defined in the research literature, the criteria for evaluating its authenticity, the measurement of process and product, and the link between assessment and instruction. Three important dimensions that must be considered in describing…
Descriptors: Alternative Assessment, Educational Assessment, Elementary Secondary Education, Evaluation Methods

