Showing 1 to 15 of 37 results
Peer reviewed
Download full text (PDF on ERIC)
Leonidas Zotos; Hedderik van Rijn; Malvina Nissim – International Educational Data Mining Society, 2025
In an educational setting, an estimate of the difficulty of Multiple-Choice Questions (MCQs), a commonly used strategy to assess learning progress, constitutes very useful information for both teachers and students. Since human assessment is costly from multiple points of view, automatic approaches to MCQ item difficulty estimation are…
Descriptors: Multiple Choice Tests, Test Items, Difficulty Level, Artificial Intelligence
Peer reviewed
Direct link
Changiz Mohiyeddini – Anatomical Sciences Education, 2025
This article presents a step-by-step guide to using R and SPSS to bootstrap exam questions. Bootstrapping, a versatile nonparametric analytical technique, can help to improve the psychometric qualities of exam questions in the process of quality assurance. Bootstrapping is particularly useful in disciplines such as medical education, where student…
Descriptors: Test Items, Sampling, Statistical Inference, Nonparametric Statistics
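As a rough illustration of the bootstrapping procedure the article walks through (demonstrated there in R and SPSS), the following Python sketch resamples examinees to put a confidence interval around one exam question's difficulty. The data, sample size, and replication count are assumptions for illustration, not values from the study.

```python
# Minimal sketch of bootstrapping an item statistic (hypothetical data);
# the article itself demonstrates the procedure in R and SPSS.
import numpy as np

rng = np.random.default_rng(42)

# Simulated 0/1 scores of 200 examinees on one exam question (illustrative assumption)
item_scores = rng.binomial(1, 0.65, size=200)

n_boot = 5000
boot_difficulties = np.empty(n_boot)
for b in range(n_boot):
    # Resample examinees with replacement and recompute the item difficulty (p-value)
    sample = rng.choice(item_scores, size=item_scores.size, replace=True)
    boot_difficulties[b] = sample.mean()

# Percentile bootstrap confidence interval for the item's difficulty
lo, hi = np.percentile(boot_difficulties, [2.5, 97.5])
print(f"difficulty = {item_scores.mean():.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```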
Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025
This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…
Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level
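To make the model structure concrete, here is a minimal Python simulation of the idea described: item effects are treated as random and regressed on target performance levels, person effects are random, and responses follow a Rasch-type model. The levels, coefficients, and sample sizes are illustrative assumptions, not the ESS application reported by CRESST.

```python
# Minimal simulation sketch of a cross-classified Rasch-type setup in which
# random item effects are regressed on target performance levels.
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 1000, 30

# Each item is written to one of three target performance levels (assumed)
target_level = rng.integers(0, 3, size=n_items)
level_effect = np.array([-1.0, 0.0, 1.0])   # assumed fixed effects of level on difficulty

# Random item effects: difficulty = level effect + item-specific residual
item_difficulty = level_effect[target_level] + rng.normal(0, 0.3, size=n_items)

# Random person effects (abilities)
theta = rng.normal(0, 1, size=n_persons)

# Rasch model: P(correct) = logistic(theta - difficulty)
logits = theta[:, None] - item_difficulty[None, :]
responses = rng.binomial(1, 1 / (1 + np.exp(-logits)))

print(responses.mean(axis=0).round(2))      # observed proportion correct per item
```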
Peer reviewed
Direct link
Kseniia Marcq; Johan Braeken – Large-scale Assessments in Education, 2025
Background: Theoretical frameworks excel in conceptualising reading literacy, yet their value hinges on their applicability for real-world purposes, such as assessment. By combining diverse theoretical frameworks, the Programme for International Student Assessment (PISA) 2018 designed an assessment framework for assessing the reading literacy of…
Descriptors: International Assessment, Achievement Tests, Foreign Countries, Secondary School Students
Peer reviewed
Direct link
Linh Thi Thao Le; Nam Thi Phuong Ho; Nguyen Huynh Trang; Hung Tan Ha – SAGE Open, 2025
The International English Language Testing System (IELTS) has served as one of the most widely accepted proofs of English language proficiency. There have been rumors of a discrepancy in difficulty between the two IELTS modules, Academic (AC) and General Training (GT); however, there is little empirical evidence to confirm such a…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Reading Tests
Peer reviewed
Download full text (PDF on ERIC)
Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025
Background/purpose: This study aimed to examine how accurately multiple-choice test item parameters are estimated under item response theory models. Materials/methods: The researchers relied on measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…
Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items
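A minimal sketch of the kind of accuracy indicator the abstract describes: the absolute difference between estimated and actual (generating) item parameters in a simulation. The crude difficulty estimator below, a scaled logit of the classical p-value, is a stand-in for illustration and not the estimation method evaluated in the study.

```python
# Simulate Rasch responses from known difficulties, form rough estimates,
# and compute the mean absolute difference between estimated and true values.
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 2000, 20

true_b = rng.uniform(-2, 2, size=n_items)    # true (generating) item difficulties
theta = rng.normal(0, 1, size=n_persons)     # true abilities

# Generate Rasch responses: P(correct) = logistic(theta - b)
p = 1 / (1 + np.exp(-(theta[:, None] - true_b[None, :])))
responses = rng.binomial(1, p)

# Crude difficulty estimates from classical p-values (rough normal-approximation scaling)
p_values = responses.mean(axis=0)
est_b = -1.16 * np.log(p_values / (1 - p_values))

# Accuracy indicator: mean absolute difference between estimated and actual values
print(f"mean |estimated - actual| = {np.abs(est_b - true_b).mean():.3f}")
```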
Peer reviewed
Direct link
Patrik Havan; Michal Kohút; Peter Halama – International Journal of Testing, 2025
Acquiescence is the tendency of participants to shift their responses toward agreement. Lechner et al. (2019) introduced the following mechanisms of acquiescence: social deference and cognitive processing. We added their interaction to this theoretical framework. The sample consists of 557 participants. We found a significant, moderately strong relationship…
Descriptors: Cognitive Processes, Attention, Difficulty Level, Reflection
Peer reviewed
Direct link
Cornelia E. Neuert – Field Methods, 2025
Using masculine forms in surveys is still common practice, with researchers presumably assuming they operate in a generic way. However, the generic masculine has been found to lead to male-biased representations in various contexts. This article studies the effects of alternative gendered linguistic forms in surveys. The language forms are…
Descriptors: Language Usage, Surveys, Response Style (Tests), Gender Bias
Peer reviewed
Download full text (PDF on ERIC)
Ali Orhan; Inan Tekin; Sedat Sen – International Journal of Assessment Tools in Education, 2025
This study aimed to translate and adapt the Computational Thinking Multidimensional Test (CTMT), developed by Kang et al. (2023), into Turkish and to investigate its psychometric qualities with Turkish university students. Following the translation procedures of the CTMT with 12 multiple-choice questions developed based on real-life…
Descriptors: Cognitive Tests, Thinking Skills, Computation, Test Validity
Peer reviewed
Download full text (PDF on ERIC)
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
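For context on the classical side of the comparison, this sketch computes one of the named measures, Cronbach's alpha, on simulated scores. The data, dimensions, and score model are assumptions for illustration; the Rasch and Mokken analyses in the study require dedicated tooling not shown here.

```python
# Minimal sketch: Cronbach's alpha on simulated C-test-style passage scores.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """scores: persons x items matrix of item (passage) scores."""
    n_items = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()    # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)       # variance of total scores
    return n_items / (n_items - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(7)
# 150 test takers x 5 passages, scores driven by a common proficiency factor (assumed)
proficiency = rng.normal(0, 1, size=(150, 1))
scores = 20 + 4 * proficiency + rng.normal(0, 3, size=(150, 5))

print(f"alpha = {cronbach_alpha(scores):.2f}")
```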
Peer reviewed
Direct link
Jerin Kim; Kent McIntosh – Journal of Positive Behavior Interventions, 2025
We aimed to identify empirically valid cut scores on the positive behavioral interventions and supports (PBIS) Tiered Fidelity Inventory (TFI) through an expert panel process known as bookmarking. The TFI is a measurement tool to evaluate the fidelity of implementation of PBIS. In the bookmark method, experts reviewed all TFI items and item scores…
Descriptors: Positive Behavior Supports, Cutting Scores, Fidelity, Program Evaluation
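The bookmark logic can be sketched briefly: items are arranged in an ordered booklet by difficulty, each panelist marks the last item a just-proficient respondent should master, and the marks are translated into a cut score. The difficulties, bookmark placements, and RP67 rule below are hypothetical and are not taken from the TFI study.

```python
# Minimal sketch of mapping panelists' bookmark placements to a cut score
# under a Rasch model with a response-probability criterion (RP67).
import numpy as np

# Ordered item booklet: difficulties sorted from easiest to hardest (illustrative values)
rasch_difficulty = np.sort(np.array([-1.4, -0.9, -0.3, 0.1, 0.6, 1.0, 1.5]))
rp = 0.67                                        # response-probability criterion

def theta_cut(bookmark_index: int) -> float:
    """Ability at which the bookmarked item is answered correctly with probability rp."""
    b = rasch_difficulty[bookmark_index]
    return b + np.log(rp / (1 - rp))             # invert the Rasch model at p = rp

panel_bookmarks = [3, 4, 3, 5, 4]                # hypothetical placements from five experts
cuts = [theta_cut(i) for i in panel_bookmarks]
print(f"recommended cut score (median theta) = {np.median(cuts):.2f}")
```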
Peer reviewed
Direct link
Katrin Schuessler; Vanessa Fischer; Maik Walpuski – Instructional Science: An International Journal of the Learning Sciences, 2025
Cognitive load studies mostly center on reports of perceived cognitive load. Single-item subjective rating scales are the dominant measurement practice to investigate overall cognitive load. Usually, either invested mental effort or perceived task difficulty is used as an overall cognitive load measure. However, the extent to which the…
Descriptors: Cognitive Processes, Difficulty Level, Rating Scales, Construct Validity
Peer reviewed
Direct link
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
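For readers unfamiliar with effort-moderated scoring, this sketch shows the general idea in a simplified classical form: responses faster than an item-level time threshold are flagged as rapid guesses and excluded before scoring. The thresholds, response-time distributions, and proportion-correct scoring are assumptions for illustration; the study itself examines IRT-based EM scoring under multidimensional rapid guessing.

```python
# Minimal sketch of the effort-moderation idea: drop rapid-guess responses
# (response time below an item threshold) before computing a score.
import numpy as np

rng = np.random.default_rng(3)
n_persons, n_items = 500, 10

responses = rng.binomial(1, 0.6, size=(n_persons, n_items))                 # 0/1 item scores
resp_time = rng.lognormal(mean=2.0, sigma=0.8, size=(n_persons, n_items))   # seconds (assumed)
threshold = np.full(n_items, 3.0)                                           # per-item RG threshold (assumed)

effortful = resp_time >= threshold               # True = solution behavior, False = rapid guess
# Effort-moderated score (classical simplification): proportion correct over effortful responses only
em_score = np.where(effortful, responses, np.nan)
em_proportion = np.nanmean(em_score, axis=1)

print(f"mean EM score = {np.nanmean(em_proportion):.3f}, "
      f"flagged rapid guesses = {100 * (1 - effortful.mean()):.1f}%")
```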
Peer reviewed
Direct link
E. B. Merki; S. I. Hofer; A. Vaterlaus; A. Lichtenberger – Physical Review Physics Education Research, 2025
When describing motion in physics, the selection of a frame of reference is crucial. The graph of a moving object can look quite different based on the frame of reference. In recent years, various tests have been developed to assess the interpretation of kinematic graphs, but none of these tests have specifically addressed differences in reference…
Descriptors: Graphs, Motion, Physics, Secondary School Students
Peer reviewed
Download full text (PDF on ERIC)
Necati Taskin – International Journal of Technology in Education, 2025
This study examines the effect of item order (random, increasingly difficult, and decreasingly difficult) on student performance, test parameters, and student perceptions in multiple-choice tests administered in a paper-and-pencil format after online learning. In the research conducted using an explanatory sequential mixed methods design,…
Descriptors: Test Items, Difficulty Level, Online Courses, College Freshmen