Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 4 |
Since 2016 (last 10 years) | 10 |
Since 2006 (last 20 years) | 15 |
Descriptor
Difficulty Level | 27 |
Test Format | 27 |
Test Validity | 27 |
Test Items | 21 |
Multiple Choice Tests | 11 |
Test Construction | 11 |
Test Reliability | 11 |
Comparative Analysis | 9 |
Foreign Countries | 9 |
Language Tests | 9 |
Computer Assisted Testing | 7 |
More ▼ |
Source
Author
Ali, Syed Haris | 1 |
Allen, Nancy L. | 1 |
Arce-Ferrer, Alvaro J. | 1 |
Boone, William J. | 1 |
Bulut, Okan | 1 |
Burton, J. Dylan | 1 |
Carlson, James E. | 1 |
Carr, Patrick A. | 1 |
Chandralekha Singh | 1 |
Chen, Jing | 1 |
Christoph, Simon | 1 |
More ▼ |
Publication Type
Education Level
Higher Education | 6 |
Postsecondary Education | 6 |
Elementary Education | 4 |
Grade 4 | 2 |
Grade 8 | 2 |
High Schools | 2 |
Middle Schools | 2 |
Secondary Education | 2 |
Early Childhood Education | 1 |
Grade 3 | 1 |
Grade 7 | 1 |
More ▼ |
Audience
Practitioners | 1 |
Researchers | 1 |
Location
Germany | 2 |
Netherlands | 2 |
China | 1 |
European Union | 1 |
Japan | 1 |
Louisiana | 1 |
Mexico | 1 |
North Dakota | 1 |
Sweden | 1 |
United Kingdom | 1 |
United Kingdom (England) | 1 |
More ▼ |
Laws, Policies, & Programs
Pell Grant Program | 1 |
Assessments and Surveys
National Assessment of… | 2 |
Test of English as a Foreign… | 2 |
ACT Assessment | 1 |
Defining Issues Test | 1 |
Sequential Tests of… | 1 |
What Works Clearinghouse Rating
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…
Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level
Yangqiuting Li; Chandralekha Singh – Physical Review Physics Education Research, 2025
Research-based multiple-choice questions implemented in class with peer instruction have been shown to be an effective tool for improving students' engagement and learning outcomes. Moreover, multiple-choice questions that are carefully sequenced to build on each other can be particularly helpful for students to develop a systematic understanding…
Descriptors: Physics, Science Instruction, Science Tests, Multiple Choice Tests
Burton, J. Dylan – Language Assessment Quarterly, 2023
The effects of question or task complexity on second language speaking have traditionally been investigated using complexity, accuracy, and fluency measures. Response processes in speaking tests, however, may manifest in other ways, such as through nonverbal behavior. Eye behavior, in the form of averted gaze or blinking frequency, has been found…
Descriptors: Oral Language, Speech Communication, Language Tests, Eye Movements
Arce-Ferrer, Alvaro J.; Bulut, Okan – Journal of Experimental Education, 2019
This study investigated the performance of four widely used data-collection designs in detecting test-mode effects (i.e., computer-based versus paper-based testing). The experimental conditions included four data-collection designs, two test-administration modes, and the availability of an anchor assessment. The test-level and item-level results…
Descriptors: Data Collection, Test Construction, Test Format, Computer Assisted Testing
Masrai, Ahmed – SAGE Open, 2022
Vocabulary size measures serve important functions, not only with respect to placing learners at appropriate levels on language courses but also with a view to examining the progress of learners. One of the widely reported formats suitable for these purposes is the Yes/No vocabulary test. The primary aim of this study was to introduce and provide…
Descriptors: Vocabulary Development, Language Tests, English (Second Language), Second Language Learning
Smolinsky, Lawrence; Marx, Brian D.; Olafsson, Gestur; Ma, Yanxia A. – Journal of Educational Computing Research, 2020
Computer-based testing is an expanding use of technology offering advantages to teachers and students. We studied Calculus II classes for science, technology, engineering, and mathematics majors using different testing modes. Three sections with 324 students employed: paper-and-pencil testing, computer-based testing, and both. Computer tests gave…
Descriptors: Test Format, Computer Assisted Testing, Paper (Material), Calculus
Edward Paul Getman – Online Submission, 2020
Despite calls for engaging assessments targeting young language learners (YLLs) between 8 and 13 years old, what makes assessment tasks engaging and how such task characteristics affect measurement quality have not been well studied empirically. Furthermore, there has been a dearth of validity research about technology-enhanced speaking tests for…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Learner Engagement
Schoen, Robert C.; Yang, Xiaotong; Liu, Sicong; Paek, Insu – Grantee Submission, 2017
The Early Fractions Test v2.2 is a paper-pencil test designed to measure mathematics achievement of third- and fourth-grade students in the domain of fractions. The purpose, or intended use, of the Early Fractions Test v2.2 is to serve as a measure of student outcomes in a randomized trial designed to estimate the effect of an educational…
Descriptors: Psychometrics, Mathematics Tests, Mathematics Achievement, Fractions
Ali, Syed Haris; Carr, Patrick A.; Ruit, Kenneth G. – Journal of the Scholarship of Teaching and Learning, 2016
Plausible distractors are important for accurate measurement of knowledge via multiple-choice questions (MCQs). This study demonstrates the impact of higher distractor functioning on validity and reliability of scores obtained on MCQs. Freeresponse (FR) and MCQ versions of a neurohistology practice exam were given to four cohorts of Year 1 medical…
Descriptors: Scores, Multiple Choice Tests, Test Reliability, Test Validity
Culligan, Brent – Language Testing, 2015
This study compared three common vocabulary test formats, the Yes/No test, the Vocabulary Knowledge Scale (VKS), and the Vocabulary Levels Test (VLT), as measures of vocabulary difficulty. Vocabulary difficulty was defined as the item difficulty estimated through Item Response Theory (IRT) analysis. Three tests were given to 165 Japanese students,…
Descriptors: Language Tests, Test Format, Comparative Analysis, Vocabulary
Schwichow, Martin; Christoph, Simon; Boone, William J.; Härtig, Hendrik – International Journal of Science Education, 2016
The so-called control-of-variables strategy (CVS) incorporates the important scientific reasoning skills of designing controlled experiments and interpreting experimental outcomes. As CVS is a prominent component of science standards appropriate assessment instruments are required to measure these scientific reasoning skills and to evaluate the…
Descriptors: Thinking Skills, Science Instruction, Science Experiments, Science Tests
Chen, Jing; Sheehan, Kathleen M. – ETS Research Report Series, 2015
The "TOEFL"® family of assessments includes the "TOEFL"® Primary"™, "TOEFL Junior"®, and "TOEFL iBT"® tests. The linguistic complexity of stimulus passages in the reading sections of the TOEFL family of assessments is expected to differ across the test levels. This study evaluates the linguistic…
Descriptors: Language Tests, Second Language Learning, English (Second Language), Reading Comprehension
DeStefano, Lizanne; Johnson, Jeremiah – American Institutes for Research, 2013
This paper describes one of the first efforts by the National Assessment of Educational Progress (NAEP) to improve measurement at the lower end of the distribution, including measurement for students with disabilities (SD) and English language learners (ELLs). One way to improve measurement at the lower end is to introduce one or more…
Descriptors: National Competency Tests, Measures (Individuals), Disabilities, English Language Learners
Nehm, Ross H.; Schonfeld, Irvin Sam – Journal of Research in Science Teaching, 2008
Growing recognition of the central importance of fostering an in-depth understanding of natural selection has, surprisingly, failed to stimulate work on the development and rigorous evaluation of instruments that measure knowledge of it. We used three different methodological tools, the Conceptual Inventory of Natural Selection (CINS), a modified…
Descriptors: Evolution, Science Education, Interviews, Measures (Individuals)

Weiten, Wayne – Journal of Experimental Education, 1982
A comparison of double as opposed to single multiple-choice questions yielded significant differences in regard to item difficulty, item discrimination, and internal reliability, but not concurrent validity. (Author/PN)
Descriptors: Difficulty Level, Educational Testing, Higher Education, Multiple Choice Tests
Previous Page | Next Page »
Pages: 1 | 2