Publication Date
| Date Range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 37 |
| Since 2022 (last 5 years) | 187 |
| Since 2017 (last 10 years) | 534 |
| Since 2007 (last 20 years) | 984 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Difficulty Level | 1620 |
| Test Items | 1620 |
| Item Response Theory | 460 |
| Test Construction | 428 |
| Foreign Countries | 411 |
| Item Analysis | 377 |
| Multiple Choice Tests | 298 |
| Test Reliability | 279 |
| Test Validity | 248 |
| Comparative Analysis | 216 |
| Scores | 211 |
Author
| Author | Records |
| --- | --- |
| Tindal, Gerald | 21 |
| Alonzo, Julie | 16 |
| Plake, Barbara S. | 13 |
| Reckase, Mark D. | 12 |
| Anderson, Daniel | 9 |
| Sinharay, Sandip | 9 |
| Wise, Steven L. | 9 |
| Herrmann-Abell, Cari F. | 8 |
| Kostin, Irene | 8 |
| Park, Bitnara Jasmine | 8 |
| Bulut, Okan | 7 |
Location
| Location | Records |
| --- | --- |
| Turkey | 46 |
| Indonesia | 35 |
| Germany | 31 |
| Australia | 26 |
| Canada | 19 |
| United States | 16 |
| South Africa | 15 |
| California | 14 |
| Florida | 14 |
| China | 13 |
| Nigeria | 13 |
Laws, Policies, & Programs
| Law / Program | Records |
| --- | --- |
| No Child Left Behind Act 2001 | 3 |
| Education Consolidation… | 1 |
| Elementary and Secondary… | 1 |
| Head Start | 1 |
Leonidas Zotos; Hedderik van Rijn; Malvina Nissim – International Educational Data Mining Society, 2025
In an educational setting, an estimate of the difficulty of Multiple-Choice Questions (MCQs), a format commonly used to assess learning progress, is very useful information for both teachers and students. Since human assessment is costly from multiple points of view, automatic approaches to MCQ item difficulty estimation are…
Descriptors: Multiple Choice Tests, Test Items, Difficulty Level, Artificial Intelligence
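For context, the quantity such automatic estimators are usually trained to predict is the classical test theory difficulty of an item: the proportion of examinees who answer it correctly. A minimal sketch of that baseline quantity (illustrative only, not the authors' estimation method; all data below is invented):

```python
import numpy as np

def item_difficulty(responses: np.ndarray) -> np.ndarray:
    """Classical test theory difficulty (p-value) per item:
    the proportion of examinees answering correctly.
    `responses` is a binary matrix, rows = examinees, cols = items."""
    return responses.mean(axis=0)

# Toy data: 5 examinees x 3 items (1 = correct, 0 = incorrect)
scores = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
])
print(item_difficulty(scores))  # [0.8 0.2 0.4]
```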
Changiz Mohiyeddini – Anatomical Sciences Education, 2025
This article presents a step-by-step guide to using R and SPSS to bootstrap exam questions. Bootstrapping, a versatile nonparametric analytical technique, can help to improve the psychometric qualities of exam questions in the process of quality assurance. Bootstrapping is particularly useful in disciplines such as medical education, where student…
Descriptors: Test Items, Sampling, Statistical Inference, Nonparametric Statistics
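The article's worked examples are in R and SPSS; purely to illustrate the resampling idea, here is a minimal Python sketch that builds a percentile bootstrap confidence interval for one item's difficulty. Everything below (data, interval level, function name) is assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_ci(item_scores, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap CI for an item's difficulty (mean score):
    resample examinees with replacement, recompute the mean each time,
    and take the empirical alpha/2 and 1 - alpha/2 quantiles."""
    item_scores = np.asarray(item_scores)
    n = len(item_scores)
    means = np.array([
        item_scores[rng.integers(0, n, n)].mean()
        for _ in range(n_boot)
    ])
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])

# 0/1 scores of 20 examinees on one exam question
scores = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1]
print(bootstrap_ci(scores))  # roughly [0.50, 0.90]
```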
Rodgers, Emily; D'Agostino, Jerome V.; Berenbon, Rebecca; Johnson, Tracy; Winkler, Christa – Journal of Early Childhood Literacy, 2023
Running Records are thought to be an excellent formative assessment tool because they generate results that educators can use to make their teaching more responsive. Despite the technical nature of scoring Running Records and the kinds of important decisions that are attached to their analysis, few studies have investigated assessor accuracy. We…
Descriptors: Formative Evaluation, Scoring, Accuracy, Difficulty Level
Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025
This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…
Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level
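The cross-classified model itself requires specialized estimation software; as background only, the Rasch (1PL) building block underlying such IRT models expresses the probability of a correct response as a logistic function of person ability minus item difficulty:

```python
import math

def rasch_prob(theta: float, b: float) -> float:
    """Rasch (1PL) model: probability that a person with ability
    `theta` answers an item of difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# An average person (theta = 0) on an easy item (b = -1)
print(rasch_prob(0.0, -1.0))  # ~0.73
```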
Kseniia Marcq; Johan Braeken – Large-scale Assessments in Education, 2025
Background: Theoretical frameworks excel in conceptualising reading literacy, yet their value hinges on their applicability for real-world purposes, such as assessment. By combining diverse theoretical frameworks, the Programme for International Student Assessment (PISA) 2018 designed an assessment framework for assessing the reading literacy of…
Descriptors: International Assessment, Achievement Tests, Foreign Countries, Secondary School Students
Linh Thi Thao Le; Nam Thi Phuong Ho; Nguyen Huynh Trang; Hung Tan Ha – SAGE Open, 2025
The International English Language Testing System (IELTS) has long served as one of the most widely accepted proofs of English language proficiency. Rumors persist of a discrepancy in difficulty between its two modules, Academic (AC) and General Training (GT); however, there is little empirical evidence to confirm such a…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Reading Tests
Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025
Background/purpose: This study aimed to assess the accuracy of estimating multiple-choice test item parameters under item response theory models. Materials/methods: The researchers relied on measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…
Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items
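To make that accuracy indicator concrete, here is a small parameter recovery simulation under a Rasch model: responses are simulated from known difficulties, difficulties are re-estimated with a deliberately crude plug-in estimator, and the indicator is the absolute difference between estimated and true values. Everything below is assumed for illustration, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed true item difficulties; person abilities drawn from N(0, 1)
b_true = np.array([-1.0, 0.0, 1.0])
theta = rng.normal(0.0, 1.0, size=5000)

# Simulate binary responses under the Rasch model
p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b_true[None, :])))
responses = (rng.random(p.shape) < p).astype(float)

# Crude plug-in estimate: negative logit of each item's proportion correct
# (biased toward zero, but enough to illustrate the indicator)
p_correct = responses.mean(axis=0)
b_hat = -np.log(p_correct / (1.0 - p_correct))

# Accuracy indicator: absolute difference, estimated vs. true
print(np.abs(b_hat - b_true))
```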
Patrik Havan; Michal Kohút; Peter Halama – International Journal of Testing, 2025
Acquiescence is the tendency of participants to shift their responses toward agreement. Lechner et al. (2019) introduced the following mechanisms of acquiescence: social deference and cognitive processing. We added their interaction to a theoretical framework. The sample consists of 557 participants. We found a significant, moderately strong relationship…
Descriptors: Cognitive Processes, Attention, Difficulty Level, Reflection
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the numbers of correct and wrong answers, which become the basis for calculating the expected response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
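The article's exact statistic is not reproduced in the abstract; the sketch below shows the general shape of such a test, assuming a null hypothesis under which wrong answers are spread evenly across distracters, with expected frequencies derived from the total number of wrong answers (counts are invented):

```python
from scipy.stats import chisquare

# Observed counts of each wrong option (distracter) on one MCQ item
observed = [34, 21, 25]          # three distracters
wrong_total = sum(observed)      # 80 wrong answers overall

# Null hypothesis: all distracters are equally attractive,
# so expected counts split the wrong answers evenly
expected = [wrong_total / len(observed)] * len(observed)

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(stat, p_value)
```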
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023
Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to give radical underestimates of reliability for tests common in educational achievement testing. Such tests are often composed of items with widely varying difficulties. This is a typical pattern where the traditional…
Descriptors: Test Reliability, Achievement Tests, Computation, Test Items
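For reference, coefficient alpha, one of the traditional estimators the article critiques, can be computed directly from the item variances and the total score variance; a minimal sketch with toy data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances
    / variance of the total score). Rows = examinees, cols = items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Toy data: 5 examinees x 4 dichotomous items
scores = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(cronbach_alpha(scores))
```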
Cornelia E. Neuert – Field Methods, 2025
Using masculine forms in surveys is still common practice, with researchers presumably assuming they operate in a generic way. However, the generic masculine has been found to lead to male-biased representations in various contexts. This article studies the effects of alternative gendered linguistic forms in surveys. The language forms are…
Descriptors: Language Usage, Surveys, Response Style (Tests), Gender Bias
Ali Orhan; Inan Tekin; Sedat Sen – International Journal of Assessment Tools in Education, 2025
This study aimed to translate and adapt the Computational Thinking Multidimensional Test (CTMT), developed by Kang et al. (2023), into Turkish and to investigate its psychometric qualities with Turkish university students. Following the translation procedures of the CTMT with 12 multiple-choice questions developed based on real-life…
Descriptors: Cognitive Tests, Thinking Skills, Computation, Test Validity
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
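In the Yes/No Angoff method referenced here, each panelist judges, item by item, whether a minimally competent examinee would answer correctly, and a cut score is aggregated from those judgments. A minimal sketch of one common aggregation rule, with assumed panel data (the study's low-cost modifications are not reproduced):

```python
import numpy as np

# Rows = panelists, cols = items; 1 = "yes, a minimally
# competent examinee would answer this item correctly"
judgments = np.array([
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0],
])

# Each panelist's recommended cut score is their count of yes items;
# the panel's cut score is the mean across panelists.
panelist_cuts = judgments.sum(axis=1)
print(panelist_cuts.mean())  # ~3.33 out of 5 items
```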
Samah AlKhuzaey; Floriana Grasso; Terry R. Payne; Valentina Tamma – International Journal of Artificial Intelligence in Education, 2024
Designing and constructing pedagogical tests that contain items (i.e. questions) which measure various types of skills for different levels of students equitably is a challenging task. Teachers and item writers alike need to ensure that the quality of assessment materials is consistent, if student evaluations are to be objective and effective.…
Descriptors: Test Items, Test Construction, Difficulty Level, Prediction
Kuan-Yu Jin; Thomas Eckes – Educational and Psychological Measurement, 2024
Insufficient effort responding (IER) refers to a lack of effort when answering survey or questionnaire items. Such items typically offer more than two ordered response categories, with Likert-type scales as the most prominent example. The underlying assumption is that the successive categories reflect increasing levels of the latent variable…
Descriptors: Item Response Theory, Test Items, Test Wiseness, Surveys
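The authors work within an IRT framework; a much simpler screening index often used to flag IER (not the authors' method) is the longstring: the longest run of identical consecutive responses in a respondent's answers. A minimal sketch with invented Likert data:

```python
from itertools import groupby

def longstring(responses):
    """Length of the longest run of identical consecutive answers;
    large values can flag insufficient effort responding (IER)."""
    return max(len(list(run)) for _, run in groupby(responses))

# 5-point Likert answers from one respondent
answers = [3, 3, 3, 3, 3, 2, 4, 4, 1, 3]
print(longstring(answers))  # 5
```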

