ERIC - Search Results

Showing all 14 results

Anchor Test Type and Population Invariance: An Exploration across Subpopulations and Test Administrations

Peer reviewed

Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008

This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…

Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods

Investigating the Population Sensitivity Assumption of Item Response Theory True-Score Equating across Two Subgroups of Examinees and Two Test Formats

Peer reviewed

von Davier, Alina A.; Wilson, Christine – Applied Psychological Measurement, 2008

Dorans and Holland (2000) and von Davier, Holland, and Thayer (2003) introduced measures of the degree to which an observed-score equating function is sensitive to the population on which it is computed. This article extends the findings of Dorans and Holland and of von Davier et al. to item response theory (IRT) true-score equating methods that…

Descriptors: Advanced Placement, Advanced Placement Programs, Equated Scores, Calculus

A Quadratic Curve Equating Method to Equate the First Three Moments in Equipercentile Equating.

Peer reviewed

Wang, Tianyou; Kolen, Michael J. – Applied Psychological Measurement, 1996

A quadratic curve test equating method is proposed for equating different test forms under a random-groups data collection design; it equates the first three central moments of the test forms. When applied to real test data, the method performs as well as other equating methods. Procedures for implementing the method are described. (SLD)

Descriptors: Data Collection, Equated Scores, Standardized Tests, Test Construction

IRT Test Assembly Using Network-Flow Programming.

Peer reviewed

Armstrong, Ronald D.; Jones, Douglas H.; Kunce, Charles S. – Applied Psychological Measurement, 1998

Investigated the use of mathematical programming techniques to generate parallel test forms with passages and items based on item-response theory (IRT) using the Fundamentals of Engineering Examination. Generated four parallel test forms from the item bank of almost 1,100 items. Comparison with human-generated forms supports the mathematical…

Descriptors: Engineering, Item Banks, Item Response Theory, Test Construction

An Investigation of the Sampling Distributions of Equating Coefficients.

Peer reviewed

Baker, Frank B. – Applied Psychological Measurement, 1996

Using the characteristic curve method for dichotomously scored test items, the sampling distributions of equating coefficients were examined. Simulations indicate that for the equating conditions studied, the sampling distributions of the equating coefficients appear to have acceptable characteristics, suggesting confidence in the values obtained…

Descriptors: Equated Scores, Item Response Theory, Sampling, Statistical Distributions

Standard Errors of Levine Linear Equating.

Peer reviewed

Hanson, Bradley A.; And Others – Applied Psychological Measurement, 1993

The delta method was used to derive standard errors (SEs) of the Levine observed score and Levine true score linear test equating methods using data from two test forms. SEs derived without the normality assumption and bootstrap SEs were very close. The situation with skewed score distributions is also discussed. (SLD)

Descriptors: Equated Scores, Equations (Mathematics), Error of Measurement, Sampling

A General Approach to Algorithmic Design of Fixed-Form Tests, Adaptive Tests, and Testlets.

Peer reviewed

Berger, Martijn P. F. – Applied Psychological Measurement, 1994

This paper focuses on similarities of optimal design of fixed-form tests, adaptive tests, and testlets within the framework of the general theory of optimal designs. A sequential design procedure is proposed that uses these similarities to obtain consistent estimates for the trait level distribution. (SLD)

Descriptors: Achievement Tests, Adaptive Testing, Algorithms, Estimation (Mathematics)

Complex Composites: Issues That Arise in Combining Different Modes of Assessment.

Peer reviewed

Wilson, Mark; Wang, Wen-chung – Applied Psychological Measurement, 1995

Data from the California Learning Assessment System mathematics assessment were used to examine issues that arise when scores from different assessment modes are combined. Multiple-choice, open-ended, and investigation items were combined in a test across three test forms. Results illustrate the difficulties faced in evaluating combined…

Descriptors: Educational Assessment, Equated Scores, Evaluation Methods, Item Response Theory

A Multidimensional Partial Credit Model with Associated Item and Test Statistics: An Application to Mixed-Format Tests

Peer reviewed

Yao, Lihua; Schwarz, Richard D. – Applied Psychological Measurement, 2006

Multidimensional item response theory (IRT) models have been proposed for better understanding the dimensional structure of data or to define diagnostic profiles of student learning. A compensatory multidimensional two-parameter partial credit model (M-2PPC) for constructed-response items is presented that is a generalization of those proposed to…

Descriptors: Models, Item Response Theory, Markov Processes, Monte Carlo Methods

Equating Scores from Adaptive to Linear Tests

Peer reviewed

van der Linden, Wim J. – Applied Psychological Measurement, 2006

Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Format, Equated Scores

Estimating Measures of Pass-Fail Reliability from Parallel Half-Tests.

Peer reviewed

Woodruff, David J.; Sawyer, Richard L. – Applied Psychological Measurement, 1989

Two methods--non-distributional and normal--are derived for estimating measures of pass-fail reliability. Both are based on the Spearman-Brown formula and require only a single test administration. Results from a simulation (n=20,000 examinees) and a licensure examination (n=4,828 examinees) illustrate these methods. (SLD)

Descriptors: Equations (Mathematics), Estimation (Mathematics), Licensing Examinations (Professions), Measures (Individuals)

The Effect of Numbers of Experts and Common Items on Cutting Score Equivalents Based on Expert Judgment.

Peer reviewed

Norcini, John; And Others – Applied Psychological Measurement, 1991

Effects of numbers of experts (NOEs) and common items (CIs) on the scaling of cutting scores from expert judgments were studied for 11,917 physicians taking 2 forms of a medical specialty examination. Increasing NOEs and CIs reduced error; beyond 5 experts and 25 CIs, error differences were small. (SLD)

Descriptors: Comparative Testing, Cutting Scores, Equated Scores, Estimation (Mathematics)

Effects of Response Format on Diagnostic Assessment of Scholastic Achievement.

Peer reviewed

Birenbaum, Menucha; And Others – Applied Psychological Measurement, 1992

The effect of multiple-choice (MC) or open-ended (OE) response format on diagnostic assessment of algebra test performance was investigated with 231 eighth and ninth graders in Tel Aviv (Israel) using bug analysis and rule space analysis. Both analyses indicated closer similarity between parallel OE subsets than between stem-equivalent OE and MC subsets.…

Descriptors: Algebra, Comparative Testing, Educational Assessment, Educational Diagnosis

On the Feasibility of Multiple Matching Tests--Variations on a Theme by Gulliksen.

Peer reviewed

Budescu, David V. – Applied Psychological Measurement, 1988

A multiple matching test--a 24-item Hebrew vocabulary test--was examined, in which distractors from several items are pooled into one list at the test's end. Construction of such tests was feasible. Reliability, validity, and reduction of random guessing were satisfactory when applied to data from 717 applicants to Israeli universities. (SLD)

Descriptors: College Applicants, Feasibility Studies, Foreign Countries, Guessing (Tests)
