ERIC - Search Results

Showing all 13 results

IRT Approaches to Modeling Scores on Mixed-Format Tests

Peer reviewed

Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020

This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…

Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests

Comparison of Performance Measures Obtained from Foreign Language Tests According to Item Response Theory vs Classical Test Theory

Peer reviewed

Polat, Murat – International Online Journal of Education and Teaching, 2022

Foreign language testing is a multi-dimensional phenomenon and obtaining objective and error-free scores on learners' language skills is often problematic. While assessing foreign language performance on high-stakes tests, using different testing approaches including Classical Test Theory (CTT), Generalizability Theory (GT) and/or Item Response…

Descriptors: Second Language Learning, Second Language Instruction, Item Response Theory, Language Tests

The Effect of Item Form on Estimating Person's Ability, Item Parameters, and Information Function According to Item Response Theory (IRT)

Peer reviewed

ALKursheh, Taha Okleh; Al-zboon, Habis Saad; AlNasraween, Mo'en Salman – International Journal of Instruction, 2022

This study aimed to compare the effect of two test item formats (multiple-choice and complete) on estimating a person's ability, item parameters, and the test information function (TIF). To achieve the aim of the study, two formats of a mathematics test were created: multiple-choice and complete. In its final format, the test consisted of 31 items. The…

Descriptors: Comparative Analysis, Test Items, Item Response Theory, Test Format

Position of Correct Option and Distractors Impacts Responses to Multiple-Choice Items: Evidence from a National Test

Peer reviewed

Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023

Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…

Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses

Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

Wang, Wei – ProQuest LLC, 2013

Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

Descriptors: Equated Scores, Test Format, Test Items, Test Length

Comparing Yes/No Angoff and Bookmark Standard Setting Methods in the Context of English Assessment

Peer reviewed

Hsieh, Mingchuan – Language Assessment Quarterly, 2013

The Yes/No Angoff and Bookmark methods for setting standards on educational assessments are currently two of the most popular standard-setting methods. However, there is no research into the comparability of these two methods in the context of language assessment. This study compared results from the Yes/No Angoff and Bookmark methods as applied to…

Descriptors: Standard Setting (Scoring), Comparative Analysis, Language Tests, Multiple Choice Tests

The Stability of the Score Scales for the "SAT Reasoning Test"™ from 2005 to 2010. Research Report. ETS RR-12-15

Peer reviewed

Guo, Hongwen; Liu, Jinghua; Curley, Edward; Dorans, Neil – ETS Research Report Series, 2012

This study examines the stability of the "SAT Reasoning Test"™ score scales from 2005 to 2010. A 2005 old form (OF) was administered along with a 2010 new form (NF). A new conversion for OF was derived through direct equipercentile equating. A comparison of the newly derived and the original OF conversions showed that Critical Reading…

Descriptors: Aptitude Tests, Cognitive Tests, Thinking Skills, Equated Scores

Comparison of Subscores Based on Classical Test Theory Methods. Research Report. ETS RR-08-54

Peer reviewed

Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – ETS Research Report Series, 2008

Will reporting subscores provide any information beyond what the total score provides? Is there a method that can be used to provide more trustworthy subscores than observed subscores? These 2 questions are addressed in this study. To answer the 2nd question, 2 subscore estimation methods (i.e., subscore estimated from the observed total score or…

Descriptors: Comparative Analysis, Scores, Tests, Certification

Empirical Estimates of the Comparative Reliability of Matching Tests and Multiple-Choice Tests.

Peer reviewed

Zimmerman, Donald W.; And Others – Journal of Experimental Education, 1984

Three types of test were compared: a completion test, a matching test, and a multiple-choice test. The completion test was more reliable than the matching test, and the matching test was more reliable than the multiple-choice test. (Author/BW)

Descriptors: Comparative Analysis, Error of Measurement, Higher Education, Mathematical Models

Detecting Answer Copying Using the Kappa Statistic

Peer reviewed

Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006

A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…

Descriptors: Cheating, Test Items, Simulation, Statistical Analysis

An Empirical Comparison of Cutoff Score Methods for Content-Related and Criterion-Related Validity Settings.

Peer reviewed

Woehr, David J.; And Others – Educational and Psychological Measurement, 1991

Methods for setting cutoff scores based on criterion performance, normative comparison, and absolute judgment were compared for scores on a multiple-choice psychology examination for 121 undergraduates and 251 undergraduates as a comparison group. All methods fell within the standard error of measurement. Implications of differences for decision…

Descriptors: Comparative Analysis, Concurrent Validity, Content Validity, Cutting Scores

A Comparison of Several Multiple-Choice, Linguistic-Based Item Writing Algorithms.

Roid, Gale; Haladyna, Tom – 1978

The technology of transforming sentences from prose instruction into test questions was examined by comparing two methods of selecting sentences (keyword vs. rare singleton), two types of question words (nouns vs. adjectives), and two foil construction methods (writer's choice vs. algorithmic). Four item writers created items using each…

Descriptors: Algorithms, Cloze Procedure, Comparative Analysis, Criterion Referenced Tests

A Comparison of the Structural Relationships among Reading, Listening, Writing, and Speaking Components of the AP French Language Examination for AP Candidates and College Students.

Morgan, Rick; Mazzeo, John – 1988

The dimensional structure of the 1987 Advanced Placement (AP) French language examination was tested in four populations using a series of confirmatory linear factor analysis models. To mitigate problems with the linear factor analysis of multiple choice items, the linear factor analysis of item parcel scores, made of small mutually exclusive…

Descriptors: Advanced Placement Programs, College Students, Comparative Analysis, Error of Measurement
