Showing 1 to 15 of 19 results
Peer reviewed
Jones, Paul; Tong, Ye; Liu, Jinghua; Borglum, Joshua; Primoli, Vince – Journal of Educational Measurement, 2022
This article studied two methods to detect mode effects in two credentialing exams. In Study 1, we used a "modal scale comparison approach," where the same pool of items was calibrated separately, without transformation, within two TC cohorts (TC1 and TC2) and one OP cohort (OP1) matched on their pool-based scale score distributions. The…
Descriptors: Scores, Credentials, Licensing Examinations (Professions), Computer Assisted Testing
Peer reviewed
Baldwin, Peter; Clauser, Brian E. – Journal of Educational Measurement, 2022
While score comparability across test forms typically relies on common (or randomly equivalent) examinees or items, innovations in item formats, test delivery, and efforts to extend the range of score interpretation may require a special data collection before examinees or items can be used in this way--or may be incompatible with common examinee…
Descriptors: Scoring, Testing, Test Items, Test Format
Peer reviewed
Li, Jie; van der Linden, Wim J. – Journal of Educational Measurement, 2018
The final step of the typical process of developing educational and psychological tests is to place the selected test items in a formatted test form. This step involves grouping and ordering the items to meet a variety of formatting constraints. As this activity tends to be time-intensive, the use of mixed-integer programming (MIP) has been…
Descriptors: Programming, Automation, Test Items, Test Format
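The MIP approach this abstract describes reduces to binary selection variables under linear constraints. As a minimal sketch (not the authors' formulation), the example below assembles a short form with the open-source PuLP solver; the item pool, content quotas, and difficulty target are all hypothetical.

```python
# Minimal sketch of MIP-based test assembly (hypothetical data).
import pulp

# Hypothetical item pool: (item id, content area, difficulty)
pool = [("i1", "algebra", 0.30), ("i2", "algebra", 0.60),
        ("i3", "geometry", 0.40), ("i4", "geometry", 0.70),
        ("i5", "algebra", 0.50), ("i6", "geometry", 0.20)]

prob = pulp.LpProblem("test_assembly", pulp.LpMinimize)
x = {i: pulp.LpVariable(f"x_{i}", cat=pulp.LpBinary) for i, _, _ in pool}
dev = pulp.LpVariable("dev", lowBound=0)  # |total difficulty - target|
prob += dev  # objective: stay close to the difficulty target

# Formatting constraints: exactly 4 items, 2 per content area.
prob += pulp.lpSum(x.values()) == 4
for area in ("algebra", "geometry"):
    prob += pulp.lpSum(x[i] for i, a, _ in pool if a == area) == 2

# Linearized absolute deviation from a target total difficulty of 1.8.
total = pulp.lpSum(b * x[i] for i, _, b in pool)
prob += total - 1.8 <= dev
prob += 1.8 - total <= dev

prob.solve()
print([i for i in x if x[i].value() == 1])
```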
Peer reviewed
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
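For context on the quantities being evaluated, a minimal classical-test-theory sketch of a weighted multiple-choice-plus-free-response composite follows; the section weights, standard deviations, reliabilities, and between-section correlation are invented, and section measurement errors are assumed uncorrelated.

```python
import math

# Hypothetical section statistics: (weight, SD, reliability)
mc = (1.0, 8.0, 0.90)   # multiple-choice section
fr = (2.0, 4.0, 0.80)   # free-response section
corr = 0.6              # observed-score correlation between sections

(w1, s1, r1), (w2, s2, r2) = mc, fr

# Error variances add when section errors are uncorrelated.
err_var = w1**2 * s1**2 * (1 - r1) + w2**2 * s2**2 * (1 - r2)
comp_var = w1**2 * s1**2 + w2**2 * s2**2 + 2 * w1 * w2 * corr * s1 * s2

sem = math.sqrt(err_var)              # overall SEM of the composite
reliability = 1 - err_var / comp_var  # composite score reliability
print(f"composite SEM = {sem:.2f}, reliability = {reliability:.3f}")
```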
Peer reviewed
Shear, Benjamin R. – Journal of Educational Measurement, 2023
Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…
Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests
Peer reviewed
Liu, Shuchang; Cai, Yan; Tu, Dongbo – Journal of Educational Measurement, 2018
This study applied on-the-fly assembled multistage adaptive testing to cognitive diagnosis (CD-OMST). Several module assembly methods for CD-OMST were proposed and compared in terms of measurement precision, test security, and constraint management. The module assembly methods in the study included the maximum priority index…
Descriptors: Adaptive Testing, Monte Carlo Methods, Computer Security, Clinical Diagnosis
Peer reviewed
Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis
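As background on the statistical targets named here: under IRT, the test information function (TIF) is the sum of the item information functions, so assembling a parallel form amounts to matching curves over the ability scale. A minimal sketch with hypothetical 2PL item parameters:

```python
import numpy as np

def item_info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability level theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

# Hypothetical (a, b) parameters for an assembled form.
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7), (1.0, 1.2)]
theta = np.linspace(-3, 3, 61)

tif = sum(item_info_2pl(theta, a, b) for a, b in items)
# A parallel form would be assembled so that its TIF (or TCC)
# tracks the reference form's curve across this theta grid.
print(np.round(tif[::10], 2))
```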
Peer reviewed
Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2018
Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed-format pseudo tests under the…
Descriptors: Comparative Analysis, Accuracy, Models, Sample Size
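For readers unfamiliar with the method being tuned: polynomial loglinear presmoothing fits the log of the expected score frequencies with a degree-C polynomial, and the model-selection strategies compared in the study choose C. The sketch below uses Poisson regression on invented frequencies, with AIC as one plausible criterion (the excerpt does not say which seven strategies were compared):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical raw-score frequencies for a 0..20 score scale.
scores = np.arange(21)
freqs = np.array([1, 2, 4, 7, 11, 15, 20, 24, 27, 28,
                  27, 24, 20, 16, 12, 8, 5, 3, 2, 1, 1])
z = (scores - scores.mean()) / scores.std()  # standardized for stability

def presmooth(C):
    """Fit log m_x = b0 + b1*x + ... + bC*x^C by Poisson regression."""
    X = np.column_stack([z**i for i in range(C + 1)])
    return sm.GLM(freqs, X, family=sm.families.Poisson()).fit()

# One possible selection strategy: choose the degree minimizing AIC.
best_C = min(range(1, 7), key=lambda C: presmooth(C).aic)
print("selected degree:", best_C)
print(np.round(presmooth(best_C).fittedvalues, 1))
```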
Peer reviewed
van der Ark, L. Andries; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational Measurement, 2008
Two types of answer-copying statistics for detecting copiers in small-scale examinations are proposed. One statistic identifies the "copier-source" pair; the other additionally suggests who is the copier and who is the source. Both types of statistics can be used when the examination has alternate test forms. A simulation study shows that the…
Descriptors: Cheating, Statistics, Test Format, Measures (Individuals)
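A crude illustration of the general idea (deliberately simpler than the authors' proposed statistics): count identical responses in a suspected copier-source pair and compare the count with a binomial chance baseline. All data and the chance-match probability here are invented.

```python
from math import comb

def match_pvalue(copier, source, p_chance=0.25):
    """Upper-tail binomial probability of at least the observed number
    of identical responses occurring by chance. A crude baseline; the
    article's statistics model match probabilities item by item."""
    n = len(copier)
    k = sum(c == s for c, s in zip(copier, source))
    return sum(comb(n, j) * p_chance**j * (1 - p_chance)**(n - j)
               for j in range(k, n + 1))

copier = "ABDCADBACDABCDAACBDB"
source = "ABDCADBACDABCDAACBDD"
print(f"{match_pvalue(copier, source):.2e}")
```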
Peer reviewed
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Journal of Educational Measurement, 2008
This study addressed the sampling error and linking bias that occur with small samples in a nonequivalent groups anchor test design. We proposed a linking method called the synthetic function, which is a weighted average of the identity function and a traditional equating function (in this case, the chained linear equating function). Specifically,…
Descriptors: Equated Scores, Sample Size, Test Reliability, Comparative Analysis
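The synthetic function is, as the abstract states, a weighted average of the identity function and the chained linear equating function. A minimal sketch, with all moments and the weight w invented:

```python
def chained_linear(x, mx, sx, mv1, sv1, mv2, sv2, my, sy):
    """Chain X -> anchor V (population 1) with V -> Y (population 2),
    each link a linear (mean/SD) transformation."""
    v = mv1 + (sv1 / sx) * (x - mx)     # X to V scale in population 1
    return my + (sy / sv2) * (v - mv2)  # V to Y scale in population 2

def synthetic(x, w, **moments):
    """Weighted average of the identity function and the chained
    linear equating function; w is the weight on the identity."""
    return w * x + (1 - w) * chained_linear(x, **moments)

# Hypothetical means/SDs of X, V (in each population), and Y.
m = dict(mx=50, sx=10, mv1=20, sv1=5, mv2=21, sv2=5.5, my=52, sy=9)
print(synthetic(55, w=0.5, **m))
```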
Peer reviewed
Mills, Craig N. – Journal of Educational Measurement, 1983
This study compares the results obtained using the Angoff, borderline group, and contrasting groups methods of determining performance standards. Congruent results were obtained from the Angoff and contrasting groups methods for several test forms. Borderline group standards were not similar to standards obtained with other methods. (Author/PN)
Descriptors: Comparative Analysis, Criterion Referenced Tests, Cutting Scores, Standard Setting (Scoring)
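The Angoff method the study compares reduces to simple arithmetic: each judge estimates, for each item, the probability that a minimally competent examinee answers correctly, and the cut score is the sum of the item-level mean estimates. A small sketch with hypothetical ratings:

```python
# Hypothetical Angoff ratings: rows = judges, columns = items
# (each value is the judged probability that a borderline examinee
# answers the item correctly).
ratings = [
    [0.6, 0.8, 0.5, 0.9, 0.7],   # judge 1
    [0.5, 0.7, 0.6, 0.8, 0.6],   # judge 2
    [0.7, 0.9, 0.4, 0.9, 0.8],   # judge 3
]

n_judges = len(ratings)
item_means = [sum(col) / n_judges for col in zip(*ratings)]
cut_score = sum(item_means)  # expected raw score of a borderline examinee
print(f"Angoff cut score: {cut_score:.1f} of {len(item_means)} points")
```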
Peer reviewed
Wainer, Howard; And Others – Journal of Educational Measurement, 1994
The comparability of scores on test forms that are constructed through examinee item choice is examined in an item response theory framework. The approach is illustrated with data from the College Board's Advanced Placement Test in Chemistry taken by over 18,000 examinees. (SLD)
Descriptors: Advanced Placement, Chemistry, Comparative Analysis, Constructed Response
Peer reviewed
Jaeger, Richard M.; Wolf, Marian B. – Journal of Educational Measurement, 1982
The effectiveness of a Likert-scale format and three paired-choice presentation formats in discriminating among parents' preferences for curriculum elements was compared. Paired-choice formats yielded more reliable discriminations, which increased with stimulus specificity. Similarities and differences in preference orderings are discussed. (Author/CM)
Descriptors: Comparative Analysis, Elementary Education, Parent Attitudes, Parent School Relationship
Peer reviewed
Williams, Valerie S. L.; Pommerich, Mary; Thissen, David – Journal of Educational Measurement, 1998
Created a developmental scale for the North Carolina End-of-Grade Mathematics Tests using a subset of identical test forms administered to adjacent grade levels with Thurstone scaling and Item Response Theory methods. Discusses differences in patterns produced. (Author/SLD)
Descriptors: Achievement Tests, Child Development, Comparative Analysis, Elementary Secondary Education
Peer reviewed
Frary, Robert B. – Journal of Educational Measurement, 1985
Responses to a sample test were simulated for examinees under free-response and multiple-choice formats. Test score sets were correlated with randomly generated sets of unit-normal measures. The extent of superiority of free response tests was sufficiently small so that other considerations might justifiably dictate format choice. (Author/DWH)
Descriptors: Comparative Analysis, Computer Simulation, Essay Tests, Guessing (Tests)