Search results: 34 peer-reviewed journal articles from Applied Psychological Measurement, indexed under descriptors such as Test Format, Item Response Theory, Test Construction, and Equated Scores.
Diao, Qi; van der Linden, Wim J. – Applied Psychological Measurement, 2013
Automated test assembly uses the methodology of mixed integer programming to select an optimal set of items from an item bank. Automated test-form generation uses the same methodology to optimally order the items and format the test form. From an optimization point of view, production of fully formatted test forms directly from the item pool using…
Descriptors: Automation, Test Construction, Test Format, Item Banks
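The core of such automated assembly can be illustrated with a small mixed integer program. The sketch below, assuming the open-source PuLP solver and an invented eight-item bank, selects a fixed-length form that maximizes information subject to a content constraint; it illustrates the general technique, not the authors' implementation.

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, value

# Hypothetical item bank: item id -> (information at the cut score, geometry flag)
bank = {
    1: (0.42, 0), 2: (0.35, 1), 3: (0.51, 0), 4: (0.28, 1),
    5: (0.60, 0), 6: (0.33, 1), 7: (0.47, 0), 8: (0.39, 1),
}

prob = LpProblem("test_assembly", LpMaximize)
x = {i: LpVariable(f"x_{i}", cat=LpBinary) for i in bank}  # 1 = item on the form

prob += lpSum(bank[i][0] * x[i] for i in bank)       # objective: total information
prob += lpSum(x.values()) == 4                       # form length constraint
prob += lpSum(bank[i][1] * x[i] for i in bank) >= 2  # content: >= 2 geometry items

prob.solve()
print("selected items:", sorted(i for i in bank if value(x[i]) == 1))
```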
Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G. – Applied Psychological Measurement, 2012
When tests consist of multiple-choice and constructed-response items, researchers are confronted with the question of which item response theory (IRT) model combination will appropriately represent the data collected from these mixed-format tests. This simulation study examined the performance of six model selection criteria, including the…
Descriptors: Item Response Theory, Models, Selection, Criteria
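Two of the criteria typically included in such comparisons, AIC and BIC, are simple functions of a fitted model's log-likelihood. A minimal sketch with invented log-likelihoods and parameter counts:

```python
import math

def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k          # k = number of free parameters

def bic(loglik, k, n):
    return -2.0 * loglik + k * math.log(n)  # n = number of examinees

# Hypothetical fits of two IRT model combinations to the same mixed-format data
fits = {"3PL+GRM": (-10450.2, 180), "3PL+GPCM": (-10441.7, 195)}
n = 2000
for name, (ll, k) in fits.items():
    print(f"{name}: AIC = {aic(ll, k):.1f}  BIC = {bic(ll, k, n):.1f}")
```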
Wyse, Adam E. – Applied Psychological Measurement, 2011
In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other and instead the test forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…
Descriptors: Test Format, Test Construction, Testing Programs, Psychometrics
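For a single examinee, expected classification accuracy under a normal measurement-error model reduces to a normal-CDF calculation. The sketch below uses an invented cut score and conditional standard errors for two forms; it illustrates the general quantity, not this article's specific procedure.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def p_pass(theta, se, cut):
    """Probability an examinee with true theta is classified as passing."""
    return 1.0 - phi((cut - theta) / se)

cut = 0.0
# Two alternate forms with slightly different conditional SEs at theta = 0.3:
for form, se in [("Form X", 0.28), ("Form Y", 0.31)]:
    acc = p_pass(0.3, se, cut)  # true master, so this is P(correct decision)
    print(f"{form}: P(correct pass decision) = {acc:.3f}")
```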
Belov, Dmitry I.; Armstrong, Ronald D. – Applied Psychological Measurement, 2008
This article presents an application of Monte Carlo methods for developing and assembling multistage adaptive tests (MSTs). A major advantage of the Monte Carlo assembly over other approaches (e.g., integer programming or enumerative heuristics) is that it provides a uniform sampling from all MSTs (or MST paths) available from a given item pool.…
Descriptors: Monte Carlo Methods, Adaptive Testing, Sampling, Item Response Theory
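The uniform-sampling property comes from simple rejection sampling: draw candidate modules uniformly at random from the pool and keep only those that satisfy the constraints. A minimal sketch with an invented pool and constraints:

```python
import random

random.seed(1)
# Hypothetical pool: item id -> (difficulty b, content area)
pool = {i: (random.gauss(0, 1), random.choice("AB")) for i in range(60)}

def feasible(items):
    bs = [pool[i][0] for i in items]
    areas = [pool[i][1] for i in items]
    # Module constraints: mean difficulty near 0, both content areas present
    return abs(sum(bs) / len(bs)) < 0.2 and {"A", "B"} <= set(areas)

modules = []
while len(modules) < 5:                      # assemble 5 ten-item modules
    candidate = random.sample(sorted(pool), 10)
    if feasible(candidate):                  # keep only constraint-satisfying draws
        modules.append(candidate)
print(f"kept {len(modules)} feasible 10-item modules")
```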
Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008
This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…
Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods
von Davier, Alina A.; Wilson, Christine – Applied Psychological Measurement, 2008
Dorans and Holland (2000) and von Davier, Holland, and Thayer (2003) introduced measures of the degree to which an observed-score equating function is sensitive to the population on which it is computed. This article extends the findings of Dorans and Holland and of von Davier et al. to item response theory (IRT) true-score equating methods that…
Descriptors: Advanced Placement, Advanced Placement Programs, Equated Scores, Calculus
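A common summary of this kind of population sensitivity (the Dorans-Holland RMSD) is the weighted root-mean-square gap between subgroup and total-group equating functions, expressed in reference-form standard deviation units. A sketch with invented equating functions and weights:

```python
import math

scores = [0, 1, 2, 3, 4, 5]
e_total = {x: 1.02 * x + 0.3 for x in scores}        # total-group equating
e_sub = {
    "group1": {x: 1.05 * x + 0.2 for x in scores},
    "group2": {x: 0.99 * x + 0.4 for x in scores},
}
w = {"group1": 0.6, "group2": 0.4}                   # subgroup weights
sd_y = 1.5                                           # SD of reference-form scores

for x in scores:
    rmsd = math.sqrt(sum(w[g] * (e_sub[g][x] - e_total[x]) ** 2 for g in w)) / sd_y
    print(f"score {x}: RMSD = {rmsd:.3f}")
```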
Yi, Hyun Sook; Kim, Seonghoon; Brennan, Robert L. – Applied Psychological Measurement, 2007
Large-scale testing programs involving classification decisions typically have multiple forms available and conduct equating to ensure cut-score comparability across forms. A test developer might be interested in the extent to which an examinee who happens to take a particular form would have a consistent classification decision if he or she had…
Descriptors: Classification, Reliability, Indexes, Computation
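For one examinee, a simple version of this decision-consistency question can be computed directly, assuming conditionally independent normal errors on the two forms. A sketch with invented values:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def consistency(theta, se_x, se_y, cut):
    px = 1.0 - phi((cut - theta) / se_x)   # P(pass) on form X
    py = 1.0 - phi((cut - theta) / se_y)   # P(pass) on form Y
    return px * py + (1 - px) * (1 - py)   # same decision on both forms

print(f"P(consistent decision) = {consistency(0.3, 0.28, 0.31, 0.0):.3f}")
```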
Hol, A. Michiel; Vorst, Harrie C. M.; Mellenbergh, Gideon J. – Applied Psychological Measurement, 2007
In a randomized experiment (n = 515), a conventional computerized test and a computerized adaptive test (CAT) are compared. The item pool consists of 24 polytomous motivation items. Although the items were carefully selected, calibration data showed that Samejima's graded response model did not fit the data optimally. A simulation study is done to assess possible…
Descriptors: Student Motivation, Simulation, Adaptive Testing, Computer Assisted Testing
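Samejima's graded response model, mentioned here, defines category probabilities as differences of cumulative logistic curves. A minimal sketch for one invented five-category item:

```python
from math import exp

def grm_probs(theta, a, bs):
    """P(X = k) for k = 0..len(bs); bs are ordered category thresholds."""
    cum = [1.0] + [1.0 / (1.0 + exp(-a * (theta - b))) for b in bs] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(bs) + 1)]

# A 5-category item (a = discrimination, bs = ordered thresholds)
probs = grm_probs(theta=0.5, a=1.3, bs=[-1.5, -0.4, 0.6, 1.8])
print([f"{p:.3f}" for p in probs], "sum =", f"{sum(probs):.3f}")
```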

Wang, Tianyou; Kolen, Michael J. – Applied Psychological Measurement, 1996
A quadratic curve test equating method for equating different test forms under a random-groups data collection design is proposed; it equates the first three central moments of the test forms. When applied to real test data, the method performs as well as other equating methods. Procedures for implementing the method are described. (SLD)
Descriptors: Data Collection, Equated Scores, Standardized Tests, Test Construction
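The moment-matching idea can be sketched numerically: choose the quadratic's coefficients so that the transformed form-X scores reproduce form Y's mean, standard deviation, and skewness. The example below uses simulated scores and scipy's root finder; it is an illustration of the idea, not the authors' estimation procedure.

```python
import numpy as np
from scipy.optimize import fsolve

rng = np.random.default_rng(0)
x = rng.normal(20.0, 5.0, 5000)   # form X number-correct scores (simulated)
y = rng.gamma(9.0, 2.5, 5000)     # form Y scores, mildly skewed (simulated)

def moments(s):
    """Mean, SD, and skewness of a score distribution."""
    m, sd = s.mean(), s.std()
    return np.array([m, sd, ((s - m) ** 3).mean() / sd**3])

def gap(params):
    a, b, c = params
    return moments(a + b * x + c * x**2) - moments(y)

a, b, c = fsolve(gap, x0=[0.0, 1.0, 0.0])   # start from the identity transform
print(f"equating function: e(x) = {a:.3f} + {b:.3f}x + {c:.4f}x^2")
```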

Armstrong, Ronald D.; Jones, Douglas H.; Kunce, Charles S. – Applied Psychological Measurement, 1998
Investigated the use of mathematical programming techniques to generate parallel test forms containing passages and items, based on item response theory (IRT), using the Fundamentals of Engineering Examination. Generated four parallel test forms from an item bank of almost 1,100 items. Comparison with human-generated forms supports the mathematical…
Descriptors: Engineering, Item Banks, Item Response Theory, Test Construction
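The passage structure is what distinguishes this assembly problem from plain item-level selection: binary passage variables tie each item to its passage, so an item can enter a form only if its passage does. A sketch of that constraint pattern, again assuming PuLP and invented data rather than the authors' model:

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, value

items = {  # item id -> (information, passage id)
    1: (0.5, "P1"), 2: (0.4, "P1"), 3: (0.6, "P2"),
    4: (0.3, "P2"), 5: (0.7, "P3"), 6: (0.2, "P3"),
}
passages = {"P1", "P2", "P3"}

prob = LpProblem("forms_with_passages", LpMaximize)
x = {i: LpVariable(f"x_{i}", cat=LpBinary) for i in items}
z = {p: LpVariable(f"z_{p}", cat=LpBinary) for p in passages}

prob += lpSum(items[i][0] * x[i] for i in items)     # maximize information
prob += lpSum(x.values()) == 4                       # form length
prob += lpSum(z.values()) <= 2                       # at most 2 passages
for i, (_, p) in items.items():
    prob += x[i] <= z[p]                             # item needs its passage

prob.solve()
print("items:", sorted(i for i in items if value(x[i]) == 1))
```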
Armstrong, Ronald D.; Jones, Douglas H.; Koppel, Nicole B.; Pashley, Peter J. – Applied Psychological Measurement, 2004
A multiple-form structure (MFS) is an ordered collection or network of testlets (i.e., sets of items). An examinee's progression through the network of testlets is dictated by the correctness of his or her answers, thereby adapting the test to his or her trait level. The collection of paths through the network yields the set of all possible…
Descriptors: Law Schools, Adaptive Testing, Computer Assisted Testing, Test Format
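An MFS can be represented in code as a small routing network: nodes are testlets, and the number-correct score on the current testlet determines the next one. The structure and routing thresholds below are invented for illustration:

```python
mfs = {
    "R": {"length": 10, "routes": [(0, 5, "E"), (6, 10, "H")]},  # routing stage
    "E": {"length": 10, "routes": []},                           # easier testlet
    "H": {"length": 10, "routes": []},                           # harder testlet
}

def next_testlet(node, num_correct):
    """Return the next testlet id, or None if this path is complete."""
    for lo, hi, nxt in mfs[node]["routes"]:
        if lo <= num_correct <= hi:
            return nxt
    return None

# An examinee answering 7 of 10 routing items correctly is sent to "H".
print("next testlet after scoring 7 on R:", next_testlet("R", 7))
```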

Baker, Frank B. – Applied Psychological Measurement, 1996
Using the characteristic curve method for dichotomously scored test items, the sampling distributions of equating coefficients were examined. Simulations indicate that for the equating conditions studied, the sampling distributions of the equating coefficients appear to have acceptable characteristics, suggesting confidence in the values obtained…
Descriptors: Equated Scores, Item Response Theory, Sampling, Statistical Distributions
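The characteristic curve method estimates the equating coefficients A and B by minimizing the gap between test characteristic curves computed on the two scales. The sketch below recovers known coefficients for invented 2PL item parameters using a Stocking-Lord-style criterion; it illustrates the family of methods the article studies, not its exact estimator.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
a1 = rng.uniform(0.8, 1.6, 20)                 # scale-1 discriminations
b1 = rng.normal(0.0, 1.0, 20)                  # scale-1 difficulties
A_true, B_true = 1.1, 0.2
a2, b2 = A_true * a1, (b1 - B_true) / A_true   # same items on scale 2

grid = np.linspace(-3, 3, 31)                  # theta points for the criterion

def tcc(theta, a, b):
    """2PL test characteristic curve evaluated at each theta."""
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
    return p.sum(axis=1)

def loss(params):
    A, B = params
    # Transform scale-2 parameters onto scale 1, then compare TCCs.
    return ((tcc(grid, a1, b1) - tcc(grid, a2 / A, A * b2 + B)) ** 2).sum()

res = minimize(loss, x0=[1.0, 0.0], method="Nelder-Mead")
print("estimated (A, B):", np.round(res.x, 3))   # expect about (1.1, 0.2)
```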

Kim, Jee-Seon; Hanson, Bradley A. – Applied Psychological Measurement, 2002
Presents a characteristic curve procedure for comparing transformations of the item response theory ability scale assuming the multiple-choice model. Illustrates the use of the method with an example equating American College Testing mathematics tests. (SLD)
Descriptors: Ability, Equated Scores, Item Response Theory, Mathematics Tests
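The transformations being compared are linear rescalings of the ability metric, together with the induced item parameter transformations. For reference, here is the familiar 3PL form of this linking identity; the multiple-choice model extends the same pattern to its category parameters:

```latex
\theta^{*} = A\theta + B, \qquad
a_j^{*} = \frac{a_j}{A}, \qquad
b_j^{*} = A b_j + B, \qquad
c_j^{*} = c_j .
```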

Schriesheim, Chester A.; And Others – Applied Psychological Measurement, 1989
LISREL maximum likelihood confirmatory factor analyses assessed the effects of grouped and random formats on the convergent and discriminant validity of two sets of questionnaires (job characteristics scales and satisfaction measures), each administered to 80 college students. The grouped format was superior, and the usefulness of LISREL confirmatory…
Descriptors: College Students, Higher Education, Measures (Individuals), Questionnaires

Neuman, George; Baydoun, Ramzi – Applied Psychological Measurement, 1998
Studied the cross-mode equivalence of paper-and-pencil and computer-based clerical tests with 141 undergraduates. Found no differences across modes for the two types of tests. Differences can be minimized when speeded computerized tests follow the same administration and response procedures as the paper format. (SLD)
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Higher Education