Search results: 34 peer-reviewed journal articles from Applied Psychological Measurement, indexed under descriptors such as Test Format, Item Response Theory, Test Construction, and Equated Scores.
Diao, Qi; van der Linden, Wim J. – Applied Psychological Measurement, 2013
Automated test assembly uses the methodology of mixed integer programming to select an optimal set of items from an item bank. Automated test-form generation uses the same methodology to optimally order the items and format the test form. From an optimization point of view, production of fully formatted test forms directly from the item pool using…
Descriptors: Automation, Test Construction, Test Format, Item Banks
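The core of such automated assembly can be illustrated with a small mixed integer program. The sketch below, assuming the open-source PuLP solver and an invented eight-item bank, selects a fixed-length form that maximizes information subject to a content constraint; it illustrates the general technique, not the authors' implementation.

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, value

# Hypothetical item bank: item id -> (information at the cut score, geometry flag)
bank = {
    1: (0.42, 0), 2: (0.35, 1), 3: (0.51, 0), 4: (0.28, 1),
    5: (0.60, 0), 6: (0.33, 1), 7: (0.47, 0), 8: (0.39, 1),
}

prob = LpProblem("test_assembly", LpMaximize)
x = {i: LpVariable(f"x_{i}", cat=LpBinary) for i in bank}  # 1 = item on the form

prob += lpSum(bank[i][0] * x[i] for i in bank)       # objective: total information
prob += lpSum(x.values()) == 4                       # form length constraint
prob += lpSum(bank[i][1] * x[i] for i in bank) >= 2  # content: >= 2 geometry items

prob.solve()
print("selected items:", sorted(i for i in bank if value(x[i]) == 1))
```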
Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G. – Applied Psychological Measurement, 2012
When tests consist of multiple-choice and constructed-response items, researchers are confronted with the question of which item response theory (IRT) model combination will appropriately represent the data collected from these mixed-format tests. This simulation study examined the performance of six model selection criteria, including the…
Descriptors: Item Response Theory, Models, Selection, Criteria
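Two of the criteria typically included in such comparisons, AIC and BIC, are simple functions of a fitted model's log-likelihood. A minimal sketch with invented log-likelihoods and parameter counts:

```python
import math

def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k          # k = number of free parameters

def bic(loglik, k, n):
    return -2.0 * loglik + k * math.log(n)  # n = number of examinees

# Hypothetical fits of two IRT model combinations to the same mixed-format data
fits = {"3PL+GRM": (-10450.2, 180), "3PL+GPCM": (-10441.7, 195)}
n = 2000
for name, (ll, k) in fits.items():
    print(f"{name}: AIC = {aic(ll, k):.1f}  BIC = {bic(ll, k, n):.1f}")
```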
Wyse, Adam E. – Applied Psychological Measurement, 2011
In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other and instead the test forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…
Descriptors: Test Format, Test Construction, Testing Programs, Psychometrics
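For a single examinee, expected classification accuracy under a normal measurement-error model reduces to a normal-CDF calculation. The sketch below uses an invented cut score and conditional standard errors for two forms; it illustrates the general quantity, not this article's specific procedure.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def p_pass(theta, se, cut):
    """Probability an examinee with true theta is classified as passing."""
    return 1.0 - phi((cut - theta) / se)

cut = 0.0
# Two alternate forms with slightly different conditional SEs at theta = 0.3:
for form, se in [("Form X", 0.28), ("Form Y", 0.31)]:
    acc = p_pass(0.3, se, cut)  # true master, so this is P(correct decision)
    print(f"{form}: P(correct pass decision) = {acc:.3f}")
```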
Belov, Dmitry I.; Armstrong, Ronald D. – Applied Psychological Measurement, 2008
This article presents an application of Monte Carlo methods for developing and assembling multistage adaptive tests (MSTs). A major advantage of the Monte Carlo assembly over other approaches (e.g., integer programming or enumerative heuristics) is that it provides a uniform sampling from all MSTs (or MST paths) available from a given item pool.…
Descriptors: Monte Carlo Methods, Adaptive Testing, Sampling, Item Response Theory
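The uniform-sampling property comes from simple rejection sampling: draw candidate modules uniformly at random from the pool and keep only those that satisfy the constraints. A minimal sketch with an invented pool and constraints:

```python
import random

random.seed(1)
# Hypothetical pool: item id -> (difficulty b, content area)
pool = {i: (random.gauss(0, 1), random.choice("AB")) for i in range(60)}

def feasible(items):
    bs = [pool[i][0] for i in items]
    areas = [pool[i][1] for i in items]
    # Module constraints: mean difficulty near 0, both content areas present
    return abs(sum(bs) / len(bs)) < 0.2 and {"A", "B"} <= set(areas)

modules = []
while len(modules) < 5:                      # assemble 5 ten-item modules
    candidate = random.sample(sorted(pool), 10)
    if feasible(candidate):                  # keep only constraint-satisfying draws
        modules.append(candidate)
print(f"kept {len(modules)} feasible 10-item modules")
```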
Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008
This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…
Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods
von Davier, Alina A.; Wilson, Christine – Applied Psychological Measurement, 2008
Dorans and Holland (2000) and von Davier, Holland, and Thayer (2003) introduced measures of the degree to which an observed-score equating function is sensitive to the population on which it is computed. This article extends the findings of Dorans and Holland and of von Davier et al. to item response theory (IRT) true-score equating methods that…
Descriptors: Advanced Placement, Advanced Placement Programs, Equated Scores, Calculus
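A common summary of this kind of population sensitivity (the Dorans-Holland RMSD) is the weighted root-mean-square gap between subgroup and total-group equating functions, expressed in reference-form standard deviation units. A sketch with invented equating functions and weights:

```python
import math

scores = [0, 1, 2, 3, 4, 5]
e_total = {x: 1.02 * x + 0.3 for x in scores}        # total-group equating
e_sub = {
    "group1": {x: 1.05 * x + 0.2 for x in scores},
    "group2": {x: 0.99 * x + 0.4 for x in scores},
}
w = {"group1": 0.6, "group2": 0.4}                   # subgroup weights
sd_y = 1.5                                           # SD of reference-form scores

for x in scores:
    rmsd = math.sqrt(sum(w[g] * (e_sub[g][x] - e_total[x]) ** 2 for g in w)) / sd_y
    print(f"score {x}: RMSD = {rmsd:.3f}")
```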
Yi, Hyun Sook; Kim, Seonghoon; Brennan, Robert L. – Applied Psychological Measurement, 2007
Large-scale testing programs involving classification decisions typically have multiple forms available and conduct equating to ensure cut-score comparability across forms. A test developer might be interested in the extent to which an examinee who happens to take a particular form would have a consistent classification decision if he or she had…
Descriptors: Classification, Reliability, Indexes, Computation
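For one examinee, a simple version of this decision-consistency question can be computed directly, assuming conditionally independent normal errors on the two forms. A sketch with invented values:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def consistency(theta, se_x, se_y, cut):
    px = 1.0 - phi((cut - theta) / se_x)   # P(pass) on form X
    py = 1.0 - phi((cut - theta) / se_y)   # P(pass) on form Y
    return px * py + (1 - px) * (1 - py)   # same decision on both forms

print(f"P(consistent decision) = {consistency(0.3, 0.28, 0.31, 0.0):.3f}")
```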
Hol, A. Michiel; Vorst, Harrie C. M.; Mellenbergh, Gideon J. – Applied Psychological Measurement, 2007
In a randomized experiment (n = 515), a conventional computerized test and a computerized adaptive test (CAT) are compared. The item pool consists of 24 polytomous motivation items. Although the items were carefully selected, calibration data showed that Samejima's graded response model did not fit the data optimally. A simulation study is done to assess possible…
Descriptors: Student Motivation, Simulation, Adaptive Testing, Computer Assisted Testing
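Samejima's graded response model, mentioned here, defines category probabilities as differences of cumulative logistic curves. A minimal sketch for one invented five-category item:

```python
from math import exp

def grm_probs(theta, a, bs):
    """P(X = k) for k = 0..len(bs); bs are ordered category thresholds."""
    cum = [1.0] + [1.0 / (1.0 + exp(-a * (theta - b))) for b in bs] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(bs) + 1)]

# A 5-category item (a = discrimination, bs = ordered thresholds)
probs = grm_probs(theta=0.5, a=1.3, bs=[-1.5, -0.4, 0.6, 1.8])
print([f"{p:.3f}" for p in probs], "sum =", f"{sum(probs):.3f}")
```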

Wang, Tianyou; Kolen, Michael J. – Applied Psychological Measurement, 1996
A quadratic curve test equating method for equating different test forms under a random-groups data collection design is proposed; it equates the first three central moments of the test forms. When applied to real test data, the method performs as well as other equating methods. Procedures for implementing the method are described. (SLD)
Descriptors: Data Collection, Equated Scores, Standardized Tests, Test Construction
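The moment-matching idea can be sketched numerically: choose the quadratic's coefficients so that the transformed form-X scores reproduce form Y's mean, standard deviation, and skewness. The example below uses simulated scores and scipy's root finder; it is an illustration of the idea, not the authors' estimation procedure.

```python
import numpy as np
from scipy.optimize import fsolve

rng = np.random.default_rng(0)
x = rng.normal(20.0, 5.0, 5000)   # form X number-correct scores (simulated)
y = rng.gamma(9.0, 2.5, 5000)     # form Y scores, mildly skewed (simulated)

def moments(s):
    """Mean, SD, and skewness of a score distribution."""
    m, sd = s.mean(), s.std()
    return np.array([m, sd, ((s - m) ** 3).mean() / sd**3])

def gap(params):
    a, b, c = params
    return moments(a + b * x + c * x**2) - moments(y)

a, b, c = fsolve(gap, x0=[0.0, 1.0, 0.0])   # start from the identity transform
print(f"equating function: e(x) = {a:.3f} + {b:.3f}x + {c:.4f}x^2")
```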

Armstrong, Ronald D.; Jones, Douglas H.; Kunce, Charles S. – Applied Psychological Measurement, 1998
Investigated the use of mathematical programming techniques to generate parallel test forms containing passages and items, based on item response theory (IRT), using the Fundamentals of Engineering Examination. Generated four parallel test forms from an item bank of almost 1,100 items. Comparison with human-generated forms supports the mathematical…
Descriptors: Engineering, Item Banks, Item Response Theory, Test Construction
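The passage structure is what distinguishes this assembly problem from plain item-level selection: binary passage variables tie each item to its passage, so an item can enter a form only if its passage does. A sketch of that constraint pattern, again assuming PuLP and invented data rather than the authors' model:

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, value

items = {  # item id -> (information, passage id)
    1: (0.5, "P1"), 2: (0.4, "P1"), 3: (0.6, "P2"),
    4: (0.3, "P2"), 5: (0.7, "P3"), 6: (0.2, "P3"),
}
passages = {"P1", "P2", "P3"}

prob = LpProblem("forms_with_passages", LpMaximize)
x = {i: LpVariable(f"x_{i}", cat=LpBinary) for i in items}
z = {p: LpVariable(f"z_{p}", cat=LpBinary) for p in passages}

prob += lpSum(items[i][0] * x[i] for i in items)     # maximize information
prob += lpSum(x.values()) == 4                       # form length
prob += lpSum(z.values()) <= 2                       # at most 2 passages
for i, (_, p) in items.items():
    prob += x[i] <= z[p]                             # item needs its passage

prob.solve()
print("items:", sorted(i for i in items if value(x[i]) == 1))
```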
Armstrong, Ronald D.; Jones, Douglas H.; Koppel, Nicole B.; Pashley, Peter J. – Applied Psychological Measurement, 2004
A multiple-form structure (MFS) is an ordered collection or network of testlets (i.e., sets of items). An examinee's progression through the network of testlets is dictated by the correctness of his or her answers, thereby adapting the test to his or her trait level. The collection of paths through the network yields the set of all possible…
Descriptors: Law Schools, Adaptive Testing, Computer Assisted Testing, Test Format
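An MFS can be represented in code as a small routing network: nodes are testlets, and the number-correct score on the current testlet determines the next one. The structure and routing thresholds below are invented for illustration:

```python
mfs = {
    "R": {"length": 10, "routes": [(0, 5, "E"), (6, 10, "H")]},  # routing stage
    "E": {"length": 10, "routes": []},                           # easier testlet
    "H": {"length": 10, "routes": []},                           # harder testlet
}

def next_testlet(node, num_correct):
    """Return the next testlet id, or None if this path is complete."""
    for lo, hi, nxt in mfs[node]["routes"]:
        if lo <= num_correct <= hi:
            return nxt
    return None

# An examinee answering 7 of 10 routing items correctly is sent to "H".
print("next testlet after scoring 7 on R:", next_testlet("R", 7))
```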

Baker, Frank B. – Applied Psychological Measurement, 1996
Using the characteristic curve method for dichotomously scored test items, the sampling distributions of equating coefficients were examined. Simulations indicate that for the equating conditions studied, the sampling distributions of the equating coefficients appear to have acceptable characteristics, suggesting confidence in the values obtained…
Descriptors: Equated Scores, Item Response Theory, Sampling, Statistical Distributions
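The characteristic curve method estimates the equating coefficients A and B by minimizing the gap between test characteristic curves computed on the two scales. The sketch below recovers known coefficients for invented 2PL item parameters using a Stocking-Lord-style criterion; it illustrates the family of methods the article studies, not its exact estimator.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
a1 = rng.uniform(0.8, 1.6, 20)                 # scale-1 discriminations
b1 = rng.normal(0.0, 1.0, 20)                  # scale-1 difficulties
A_true, B_true = 1.1, 0.2
a2, b2 = A_true * a1, (b1 - B_true) / A_true   # same items on scale 2

grid = np.linspace(-3, 3, 31)                  # theta points for the criterion

def tcc(theta, a, b):
    """2PL test characteristic curve evaluated at each theta."""
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
    return p.sum(axis=1)

def loss(params):
    A, B = params
    # Transform scale-2 parameters onto scale 1, then compare TCCs.
    return ((tcc(grid, a1, b1) - tcc(grid, a2 / A, A * b2 + B)) ** 2).sum()

res = minimize(loss, x0=[1.0, 0.0], method="Nelder-Mead")
print("estimated (A, B):", np.round(res.x, 3))   # expect about (1.1, 0.2)
```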

Kim, Jee-Seon; Hanson, Bradley A. – Applied Psychological Measurement, 2002
Presents a characteristic curve procedure for comparing transformations of the item response theory ability scale assuming the multiple-choice model. Illustrates the use of the method with an example equating American College Testing mathematics tests. (SLD)
Descriptors: Ability, Equated Scores, Item Response Theory, Mathematics Tests
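The transformations being compared are linear rescalings of the ability metric, together with the induced item parameter transformations. For reference, here is the familiar 3PL form of this linking identity; the multiple-choice model extends the same pattern to its category parameters:

```latex
\theta^{*} = A\theta + B, \qquad
a_j^{*} = \frac{a_j}{A}, \qquad
b_j^{*} = A b_j + B, \qquad
c_j^{*} = c_j .
```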

Schriesheim, Chester A.; And Others – Applied Psychological Measurement, 1989
LISREL maximum likelihood confirmatory factor analyses assessed the effects of grouped and random formats on the convergent and discriminant validity of two sets of questionnaires (job characteristics scales and satisfaction measures), each administered to 80 college students. The grouped format was superior, and the usefulness of LISREL confirmatory…
Descriptors: College Students, Higher Education, Measures (Individuals), Questionnaires

Neuman, George; Baydoun, Ramzi – Applied Psychological Measurement, 1998
Studied the cross-mode equivalence of paper-and-pencil and computer-based clerical tests with 141 undergraduates. Found no differences across modes for the two types of tests. Differences can be minimized when speeded computerized tests follow the same administration and response procedures as the paper format. (SLD)
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Higher Education