Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 4 |
| Since 2007 (last 20 years) | 8 |
Descriptor
| Comparative Analysis | 13 |
| Error of Measurement | 13 |
| Multiple Choice Tests | 13 |
| Item Response Theory | 6 |
| Test Items | 6 |
| Foreign Countries | 4 |
| Higher Education | 4 |
| Item Analysis | 3 |
| Test Format | 3 |
| Accuracy | 2 |
| College Entrance Examinations | 2 |
Author
| ALKursheh, Taha Okleh | 1 |
| Al-zboon, Habis Saad | 1 |
| AlNasraween, Mo'en Salman | 1 |
| Choi, Jiwon | 1 |
| Curley, Edward | 1 |
| Córdova, Nora | 1 |
| Dartnell, Pablo | 1 |
| Dorans, Neil | 1 |
| Godoy, María Inés | 1 |
| Guo, Hongwen | 1 |
| Haberman, Shelby | 1 |
Publication Type
| Reports - Research | 11 |
| Journal Articles | 10 |
| Dissertations/Theses -… | 1 |
| Reports - Evaluative | 1 |
Education Level
| Higher Education | 4 |
| Postsecondary Education | 4 |
| Secondary Education | 2 |
| Elementary Education | 1 |
| Grade 6 | 1 |
| High Schools | 1 |
| Intermediate Grades | 1 |
Location
| Chile | 1 |
| Saudi Arabia | 1 |
| Taiwan | 1 |
| Turkey | 1 |
Assessments and Surveys
| Advanced Placement… | 2 |
| SAT (College Admission Test) | 1 |
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Polat, Murat – International Online Journal of Education and Teaching, 2022
Foreign language testing is a multi-dimensional phenomenon and obtaining objective and error-free scores on learners' language skills is often problematic. While assessing foreign language performance on high-stakes tests, using different testing approaches including Classical Test Theory (CTT), Generalizability Theory (GT) and/or Item Response…
Descriptors: Second Language Learning, Second Language Instruction, Item Response Theory, Language Tests
ALKursheh, Taha Okleh; Al-zboon, Habis Saad; AlNasraween, Mo'en Salman – International Journal of Instruction, 2022
This study aimed at comparing the effect of two test item formats (multiple-choice and complete) on estimating a person's ability, item parameters, and the test information function (TIF). To achieve the aim of the study, two formats of the mathematics (1) test were created: multiple-choice and complete. In its final form, the test consisted of 31 items. The…
Descriptors: Comparative Analysis, Test Items, Item Response Theory, Test Format
Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023
Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses
Wang, Wei – ProQuest LLC, 2013
Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…
Descriptors: Equated Scores, Test Format, Test Items, Test Length
Hsieh, Mingchuan – Language Assessment Quarterly, 2013
The Yes/No Angoff and Bookmark methods for setting standards on educational assessments are currently two of the most popular standard-setting methods. However, there is no research into the comparability of these two methods in the context of language assessment. This study compared results from the Yes/No Angoff and Bookmark methods as applied to…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Language Tests, Multiple Choice Tests
Guo, Hongwen; Liu, Jinghua; Curley, Edward; Dorans, Neil – ETS Research Report Series, 2012
This study examines the stability of the "SAT Reasoning Test"™ score scales from 2005 to 2010. A 2005 old form (OF) was administered along with a 2010 new form (NF). A new conversion for OF was derived through direct equipercentile equating. A comparison of the newly derived and the original OF conversions showed that Critical Reading…
Descriptors: Aptitude Tests, Cognitive Tests, Thinking Skills, Equated Scores
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – ETS Research Report Series, 2008
Will reporting subscores provide any additional information than the total score? Is there a method that can be used to provide more trustworthy subscores than observed subscores? These 2 questions are addressed in this study. To answer the 2nd question, 2 subscore estimation methods (i.e., subscore estimated from the observed total score or…
Descriptors: Comparative Analysis, Scores, Tests, Certification
Peer reviewed: Zimmerman, Donald W.; And Others – Journal of Experimental Education, 1984
Three types of test were compared: a completion test, a matching test, and a multiple-choice test. The completion test was more reliable than the matching test, and the matching test was more reliable than the multiple-choice test. (Author/BW)
Descriptors: Comparative Analysis, Error of Measurement, Higher Education, Mathematical Models
Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006
A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…
Descriptors: Cheating, Test Items, Simulation, Statistical Analysis
Peer reviewed: Woehr, David J.; And Others – Educational and Psychological Measurement, 1991
Methods for setting cutoff scores based on criterion performance, normative comparison, and absolute judgment were compared for scores on a multiple-choice psychology examination for 121 undergraduates and 251 undergraduates as a comparison group. All methods fell within the standard error of measurement. Implications of differences for decision…
Descriptors: Comparative Analysis, Concurrent Validity, Content Validity, Cutting Scores
Roid, Gale; Haladyna, Tom – 1978
The technology of transforming sentences from prose instruction into test questions was examined by comparing two methods of selecting sentences (keyword vs. rare singleton), two types of question words (nouns vs. adjectives), and two foil construction methods (writer's choice vs. algorithmic). Four item writers created items using each…
Descriptors: Algorithms, Cloze Procedure, Comparative Analysis, Criterion Referenced Tests
Morgan, Rick; Mazzeo, John – 1988
The dimensional structure of the 1987 Advanced Placement (AP) French language examination was tested in four populations using a series of confirmatory linear factor analysis models. To mitigate problems with the linear factor analysis of multiple choice items, the linear factor analysis of item parcel scores, made of small mutually exclusive…
Descriptors: Advanced Placement Programs, College Students, Comparative Analysis, Error of Measurement

