ERIC - Search Results

Showing all 9 results

Hybrid Computerized Adaptive Testing: From Group Sequential Design to Fully Sequential Design

Peer reviewed

Wang, Shiyu; Lin, Haiyan; Chang, Hua-Hua; Douglas, Jeff – Journal of Educational Measurement, 2016

Computerized adaptive testing (CAT) and multistage testing (MST) have become two of the most popular modes in large-scale computer-based sequential testing. Though most designs of CAT and MST exhibit strength and weakness in recent large-scale implementations, there is no simple answer to the question of which design is better because different…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Format, Sequential Approach

Maintaining Equivalent Cut Scores for Small Sample Test Forms

Peer reviewed

Dwyer, Andrew C. – Journal of Educational Measurement, 2016

This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…

Descriptors: Cutting Scores, Equivalency Tests, Test Format, Academic Standards

Small-Sample Equating Using a Synthetic Linking Function

Peer reviewed

Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Journal of Educational Measurement, 2008

This study addressed the sampling error and linking bias that occur with small samples in a nonequivalent groups anchor test design. We proposed a linking method called the synthetic function, which is a weighted average of the identity function and a traditional equating function (in this case, the chained linear equating function). Specifically,…

Descriptors: Equated Scores, Sample Size, Test Reliability, Comparative Analysis
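
The synthetic function described in the abstract above is a weighted average of the identity function and an equating function. A minimal Python sketch of that idea follows. It uses the textbook form of chained linear equating through an anchor test, and it assumes that the weight w is placed on the equating function and 1 - w on the identity; the function names, the weight convention, and the sample scores are illustrative assumptions, not taken from the article.

```python
from statistics import mean, stdev


def chained_linear(x, x_p, v_p, v_q, y_q):
    """Chained linear equating of a form-X score x onto the form-Y scale.

    x_p, v_p: total and anchor scores for the group that took form X.
    v_q, y_q: anchor and total scores for the group that took form Y.
    """
    # Step 1: link X to the anchor V using the form-X group.
    v = mean(v_p) + stdev(v_p) / stdev(x_p) * (x - mean(x_p))
    # Step 2: link the anchor V to Y using the form-Y group.
    return mean(y_q) + stdev(y_q) / stdev(v_q) * (v - mean(v_q))


def synthetic_link(x, equated, w):
    """Weighted average of an equated score and the identity function.

    Assumed convention: w weights the equated score, 1 - w weights x itself.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * equated + (1.0 - w) * x


# Made-up scores for illustration only.
x_p = [10, 12, 14, 16, 18]   # form X totals, group P
v_p = [4, 5, 6, 7, 8]        # anchor scores, group P
v_q = [5, 6, 7, 8, 9]        # anchor scores, group Q
y_q = [13, 15, 17, 19, 21]   # form Y totals, group Q

equated = chained_linear(14, x_p, v_p, v_q, y_q)
print(synthetic_link(14, equated, w=0.5))
```

With w = 0 the sketch returns the raw score unchanged (pure identity), and with w = 1 it returns the chained linear result, so w controls how far the link moves from the identity function toward the equating function.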

A Mexican Version of the Peabody Picture Vocabulary Test.

Peer reviewed

Simon, Alan J.; Joiner, Lee M. – Journal of Educational Measurement, 1976

The purpose of this study was to determine whether a Mexican version of the Peabody Picture Vocabulary Test could be improved by directly translating both forms of the American test, then using decision procedures to select the better item of each pair. The reliability of the simple translations suffered. (Author/BW)

Descriptors: Early Childhood Education, Spanish, Test Construction, Test Format

A Consumers' Guide to Criterion-Referenced Test Reliability.

Peer reviewed

Berk, Ronald A. – Journal of Educational Measurement, 1980

A dozen different approaches that yield 13 reliability indices for criterion-referenced tests were identified and grouped into three categories: threshold loss function, squared-error loss function, and domain score estimation. Indices were evaluated within each category. (Author/RL)

Descriptors: Classification, Criterion Referenced Tests, Cutting Scores, Evaluation Methods

The Answer Key as a Source of Error in Examinations for Professionals.

Peer reviewed

Norcini, John J. – Journal of Educational Measurement, 1987

Answer keys for physician and teacher licensing examinations were studied. The impact of variability on total errors of measurement was examined for answer keys constructed using the aggregate method. Results indicated that, in some cases, scorers contributed to a sizable increase in measurement error. (Author/GDC)

Descriptors: Adults, Answer Keys, Error of Measurement, Evaluators

An Empirical Investigation of the Applicability of Multiple Matrix Sampling to the Method of Rank Order.

Peer reviewed

Askegaard, Lewis D.; Umila, Benwardo V. – Journal of Educational Measurement, 1982

Multiple matrix sampling of items and examinees was applied to an 18-item rank order instrument administered to a randomly assigned group and compared to the ordering and ranking of all items by control subjects. High correlations between ranks suggest the methodology may viably reduce respondent effort on long rank ordering tasks. (Author/CM)

Descriptors: Evaluation Methods, Item Sampling, Junior High Schools, Student Reaction

Multiple-Choice versus Free-Response: A Simulation Study.

Peer reviewed

Frary, Robert B. – Journal of Educational Measurement, 1985

Responses to a sample test were simulated for examinees under free-response and multiple-choice formats. Test score sets were correlated with randomly generated sets of unit-normal measures. The extent of superiority of free response tests was sufficiently small so that other considerations might justifiably dictate format choice. (Author/DWH)

Descriptors: Comparative Analysis, Computer Simulation, Essay Tests, Guessing (Tests)

The Relative Merits of Multiple True-False Achievement Tests.

Peer reviewed

Frisbie, David A.; Sweeney, Daryl C. – Journal of Educational Measurement, 1982

A 100-item five-choice multiple choice (MC) biology final exam was converted to multiple true-false (MTF) form to yield two content-parallel test forms comprised of the two item types. Students found the MTF items easier and preferred MTF over MC; the MTF subtests were more reliable. (Author/GK)

Descriptors: Biology, College Science, Comparative Analysis, Difficulty Level