Publication Date
| In 2026 | 0 |
| Since 2025 | 21 |
| Since 2022 (last 5 years) | 85 |
| Since 2017 (last 10 years) | 252 |
| Since 2007 (last 20 years) | 377 |
Descriptor
| Difficulty Level | 411 |
| Foreign Countries | 411 |
| Test Items | 411 |
| Item Response Theory | 129 |
| Test Reliability | 103 |
| Test Construction | 100 |
| Test Validity | 93 |
| Item Analysis | 89 |
| Multiple Choice Tests | 87 |
| Mathematics Tests | 68 |
| Science Tests | 67 |
Author
| Bulut, Okan | 5 |
| Retnawati, Heri | 5 |
| Baghaei, Purya | 4 |
| Long, Caroline | 4 |
| Baird, Jo-Anne | 3 |
| Crisp, Victoria | 3 |
| Janssen, Rianne | 3 |
| Planinic, Maja | 3 |
| Adadan, Emine | 2 |
| Ahmadi, Alireza | 2 |
| Andrich, David | 2 |
Audience
| Researchers | 3 |
| Practitioners | 2 |
| Policymakers | 1 |
| Students | 1 |
| Teachers | 1 |
Location
| Turkey | 46 |
| Indonesia | 35 |
| Germany | 31 |
| Australia | 26 |
| Canada | 19 |
| South Africa | 15 |
| China | 13 |
| Taiwan | 13 |
| United Kingdom | 13 |
| Iran | 12 |
| Nigeria | 12 |
Kseniia Marcq; Johan Braeken – Large-scale Assessments in Education, 2025
Background: Theoretical frameworks excel in conceptualising reading literacy, yet their value hinges on their applicability for real-world purposes, such as assessment. By combining diverse theoretical frameworks, the Programme for International Student Assessment (PISA) 2018 designed an assessment framework for assessing the reading literacy of…
Descriptors: International Assessment, Achievement Tests, Foreign Countries, Secondary School Students
Patrik Havan; Michal Kohút; Peter Halama – International Journal of Testing, 2025
Acquiescence is the tendency of participants to shift their responses toward agreement. Lechner et al. (2019) introduced two mechanisms of acquiescence: social deference and cognitive processing. We added their interaction to a theoretical framework. The sample consisted of 557 participants. We found a significant, medium-strong relationship…
Descriptors: Cognitive Processes, Attention, Difficulty Level, Reflection
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
Cornelia E. Neuert – Field Methods, 2025
Using masculine forms in surveys is still common practice, with researchers presumably assuming they operate in a generic way. However, the generic masculine has been found to lead to male-biased representations in various contexts. This article studies the effects of alternative gendered linguistic forms in surveys. The language forms are…
Descriptors: Language Usage, Surveys, Response Style (Tests), Gender Bias
Ali Orhan; Inan Tekin; Sedat Sen – International Journal of Assessment Tools in Education, 2025
This study aimed to translate and adapt the Computational Thinking Multidimensional Test (CTMT) developed by Kang et al. (2023) into Turkish and to investigate its psychometric qualities with Turkish university students. Following the translation procedures of the CTMT with 12 multiple-choice questions developed based on real-life…
Descriptors: Cognitive Tests, Thinking Skills, Computation, Test Validity
Katrin Schuessler; Vanessa Fischer; Maik Walpuski – Instructional Science: An International Journal of the Learning Sciences, 2025
Cognitive load studies are mostly centered on information on perceived cognitive load. Single-item subjective rating scales are the dominant measurement practice to investigate overall cognitive load. Usually, either invested mental effort or perceived task difficulty is used as an overall cognitive load measure. However, the extent to which the…
Descriptors: Cognitive Processes, Difficulty Level, Rating Scales, Construct Validity
E. B. Merki; S. I. Hofer; A. Vaterlaus; A. Lichtenberger – Physical Review Physics Education Research, 2025
When describing motion in physics, the selection of a frame of reference is crucial. The graph of a moving object can look quite different based on the frame of reference. In recent years, various tests have been developed to assess the interpretation of kinematic graphs, but none of these tests have specifically addressed differences in reference…
Descriptors: Graphs, Motion, Physics, Secondary School Students
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…
Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level
Martin Steinbach; Carolin Eitemüller; Marc Rodemer; Maik Walpuski – International Journal of Science Education, 2025
The intricate relationship between representational competence and content knowledge in organic chemistry has been widely debated, and the ways in which representations contribute to task difficulty, particularly in assessment, remain unclear. This paper presents a multiple-choice test instrument for assessing individuals' knowledge of fundamental…
Descriptors: Organic Chemistry, Difficulty Level, Multiple Choice Tests, Fundamental Concepts
Jeong-eun Kim – English Teaching, 2025
This study investigated the thematic and lexical characteristics of high-difficulty English reading items--commonly referred to as "killer questions"--in the Korean College Scholastic Ability Test (CSAT) between 2018 and 2025. Using text mining methods, including Latent Dirichlet Allocation (LDA) and CEFR-based lexical profiling, the…
Descriptors: English (Second Language), Difficulty Level, Test Items, Questioning Techniques
Apichat Khamboonruang – Language Testing in Asia, 2025
The Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…
Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests
Nese Öztürk Gübes – International Journal of Assessment Tools in Education, 2025
The Trends in International Mathematics and Science Study (TIMSS) was administered via computer, eTIMSS, for the first time in 2019. The purpose of this study was to investigate item block position and item format effect on eighth grade mathematics item easiness in low- and high-achieving countries of eTIMSS 2019. Item responses from Chile, Qatar,…
Descriptors: Foreign Countries, International Assessment, Achievement Tests, Mathematics Achievement
Ludewig, Ulrich; Schwerter, Jakob; McElvany, Nele – Journal of Psychoeducational Assessment, 2023
A better understanding of how distractor features influence the plausibility of distractors is essential for an efficient multiple-choice (MC) item construction in educational assessment. The plausibility of distractors has a major influence on the psychometric characteristics of MC items. Our analysis utilizes the nominal categories model to…
Descriptors: Vocabulary, Language Tests, German, Grade 4
Andrés Christiansen; Rianne Janssen – Educational Assessment, Evaluation and Accountability, 2024
In international large-scale assessments, students may not be compelled to answer every test item: a student can decide to skip a seemingly difficult item or may drop out before the end of the test is reached. The way these missing responses are treated will affect the estimation of the item difficulty and student ability, and ultimately affect…
Descriptors: Test Items, Item Response Theory, Grade 4, International Assessment
Suwita Suwita; Sulistyo Saputro; Sajidan Sajidan; Sutarno Sutarno – Journal of Baltic Science Education, 2024
The current study uses the Rasch Model to measure lower-secondary school students' critical thinking skills on photosynthesis topics. Critical thinking skills are considered essential in science education, but few valid and practical measurement instruments exist. The current study fills the gap by adapting the instrument from the Watson-Glaser…
Descriptors: Secondary School Students, Critical Thinking, Thinking Skills, Botany

