Publication Date
In 2025: 4
Since 2024: 14
Since 2021 (last 5 years): 80
Since 2016 (last 10 years): 157
Since 2006 (last 20 years): 252
Descriptor
Difficulty Level: 483
Item Analysis: 483
Test Items: 369
Test Construction: 148
Foreign Countries: 112
Multiple Choice Tests: 99
Test Validity: 90
Item Response Theory: 89
Test Reliability: 84
Comparative Analysis: 79
Statistical Analysis: 79
Author
Reckase, Mark D.: 6
Lord, Frederic M.: 5
Roid, Gale: 4
Bratfisch, Oswald: 3
Cahen, Leonard S.: 3
Dorans, Neil J.: 3
Dunne, Tim: 3
Facon, Bruno: 3
Hambleton, Ronald K.: 3
Huck, Schuyler W.: 3
Kostin, Irene: 3
Audience
Researchers: 34
Practitioners: 4
Teachers: 2
Location
Nigeria: 8
Turkey: 8
Germany: 7
Indonesia: 6
South Africa: 6
Taiwan: 6
United States: 6
Canada: 5
India: 5
Florida: 4
Australia: 3
Laws, Policies, & Programs
Education Consolidation…: 1
Elementary and Secondary…: 1
No Child Left Behind Act 2001: 1
Rachel Lee – ProQuest LLC, 2024
Classical item analysis (CIA) entails summarizing items based on two key attributes: item difficulty, defined as the proportion of examinees answering correctly, and item discrimination, defined as the difference in correctness between high and low scorers. Recent insights reveal a direct link between these measures and aspects of signal detection theory…
Descriptors: Item Analysis, Knowledge Level, Difficulty Level, Measurement
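The two indices this abstract defines are straightforward to compute. Below is a minimal sketch in Python (not code from the dissertation), run on a simulated 0/1 response matrix; the 27% upper/lower grouping rule is a common convention assumed here, not a detail taken from the source.

```python
import numpy as np

# Simulated 0/1 response matrix: rows = examinees, columns = items.
rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(200, 10))

# Item difficulty: proportion of examinees answering each item correctly.
difficulty = responses.mean(axis=0)

# Upper-lower discrimination: difference in proportion correct between the
# top and bottom 27% of examinees ranked by total score.
totals = responses.sum(axis=1)
order = np.argsort(totals)
k = int(0.27 * len(totals))
lower, upper = responses[order[:k]], responses[order[-k:]]
discrimination = upper.mean(axis=0) - lower.mean(axis=0)

print(difficulty.round(2), discrimination.round(2))
```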
Gyamfi, Abraham; Acquaye, Rosemary – Acta Educationis Generalis, 2023
Introduction: Item response theory (IRT) has received much attention in the validation of assessment instruments because it allows students' ability to be estimated from any set of items. Item response theory allows the difficulty and discrimination levels of each item on the test to be estimated. In the framework of IRT, item characteristics are…
Descriptors: Item Response Theory, Models, Test Items, Difficulty Level
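For readers unfamiliar with how IRT expresses the difficulty and discrimination this abstract mentions, here is a minimal sketch of a two-parameter logistic (2PL) item characteristic curve; the parameter values are illustrative assumptions, not estimates from the study.

```python
import numpy as np

def icc_2pl(theta, a, b):
    """2PL item characteristic curve: P(correct | ability theta),
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Illustrative item: moderately discriminating (a = 1.2), slightly hard (b = 0.5).
theta = np.linspace(-3, 3, 7)
print(icc_2pl(theta, a=1.2, b=0.5).round(3))
```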
Sweeney, Sandra M.; Sinharay, Sandip; Johnson, Matthew S.; Steinhauer, Eric W. – Educational Measurement: Issues and Practice, 2022
The focus of this paper is on the empirical relationship between item difficulty and item discrimination. Two studies--an empirical investigation and a simulation study--were conducted to examine the association between item difficulty and item discrimination under classical test theory and item response theory (IRT), and the effects of the…
Descriptors: Correlation, Item Response Theory, Item Analysis, Difficulty Level
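A toy version of the kind of simulation this abstract describes can be sketched as follows; the generating model (a 2PL with independently drawn item parameters) and the sample sizes are assumptions made for illustration, not the study's actual design.

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 1000, 40

# Simulate 2PL responses with independently drawn difficulty and discrimination.
theta = rng.normal(size=(n_persons, 1))
a = rng.lognormal(mean=0.0, sigma=0.3, size=n_items)
b = rng.normal(size=n_items)
p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
x = (rng.random(p.shape) < p).astype(int)

# CTT statistics: p-values and corrected item-total (point-biserial) correlations.
pvals = x.mean(axis=0)
rest = x.sum(axis=1, keepdims=True) - x  # rest-score excludes the item itself
pbis = np.array([np.corrcoef(x[:, j], rest[:, j])[0, 1] for j in range(n_items)])

# Empirical association between the two classical indices.
print(np.corrcoef(pvals, pbis)[0, 1])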
Wind, Stefanie A.; Ge, Yuan – Measurement: Interdisciplinary Research and Perspectives, 2023
In selected-response assessments such as attitude surveys with Likert-type rating scales, examinees often select from rating scale categories to reflect their locations on a construct. Researchers have observed that some examinees exhibit "response styles," which are systematic patterns of responses in which examinees are more likely to…
Descriptors: Goodness of Fit, Responses, Likert Scales, Models
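As a minimal illustration of what a "response style" index can look like, the sketch below computes each respondent's rate of endorsing the endpoint categories of a 5-point scale, a simple extreme-response-style flag; this generic index is a stand-in for exposition, not the fit-based approach used in the article.

```python
import numpy as np

rng = np.random.default_rng(3)
ratings = rng.integers(1, 6, size=(100, 20))  # 100 respondents, 20 Likert items (1-5)

# Extreme response style: share of each person's responses in categories 1 or 5.
extreme_rate = np.isin(ratings, [1, 5]).mean(axis=1)
print(extreme_rate[:5])  # unusually high rates suggest a systematic response style
```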
Mahmut Sami Koyuncu; Mehmet Sata – International Journal of Assessment Tools in Education, 2023
The main aim of this study was to introduce the ConQuest program, which is used in the analysis of multivariate and multidimensional data structures, and to demonstrate its application on example data structures. To achieve this goal, a basic research approach was applied, showing how to use the ConQuest program and how to prepare the data set for…
Descriptors: Data Analysis, Computer Oriented Programs, Models, Test Items
Stephen Humphry; Paul Montuoro; Carolyn Maxwell – Journal of Psychoeducational Assessment, 2024
This article builds upon a prominent definition of construct validity that focuses on variation in attributes causing variation in measurement outcomes. The article synthesizes this definition and uses Rasch measurement modeling to explicate a modified conceptualization of construct validity for assessments of developmental attributes. If…
Descriptors: Construct Validity, Measurement Techniques, Developmental Stages, Item Analysis
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Measurement: Interdisciplinary Research and Perspectives, 2021
This study offers an approach to test equating under the latent D-scoring method (DSM-L) using the nonequivalent groups with anchor tests (NEAT) design. The accuracy of the test equating was examined via a simulation study under a 3 × 3 design by two conditions: group ability at three levels and test difficulty at three levels. The results for…
Descriptors: Equated Scores, Scoring, Test Items, Accuracy
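The DSM-L procedure itself is specific to the latent D-scoring framework, but the NEAT design it runs on can be illustrated with a generic chained-linear linking sketch; this is a stand-in for exposition, not the authors' method, and all data here are simulated.

```python
import numpy as np

def chained_linear_neat(x_new, anchor_new, y_old, anchor_old):
    """Chained linear equating through an anchor test (NEAT design sketch)."""
    # Step 1: linearly map new-form scores onto the anchor scale.
    a1 = anchor_new.std() / x_new.std()
    b1 = anchor_new.mean() - a1 * x_new.mean()
    # Step 2: map anchor-scale scores onto the old form.
    a2 = y_old.std() / anchor_old.std()
    b2 = y_old.mean() - a2 * anchor_old.mean()
    return a2 * (a1 * x_new + b1) + b2

rng = np.random.default_rng(4)
x_new, anchor_new = rng.normal(20, 5, 300), rng.normal(10, 3, 300)
y_old, anchor_old = rng.normal(22, 5, 300), rng.normal(11, 3, 300)
print(chained_linear_neat(x_new, anchor_new, y_old, anchor_old)[:3])
```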
Haladyna, Thomas M.; Rodriguez, Michael C. – Educational Assessment, 2021
Full-information item analysis provides item developers and reviewers comprehensive empirical evidence of item quality, including option response frequency, point-biserial index (PBI) for distractors, mean-scores of respondents selecting each option, and option trace lines. The multi-serial index (MSI) is introduced as a more informative…
Descriptors: Test Items, Item Analysis, Reading Tests, Mathematics Tests
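Three of the per-option statistics this abstract lists (option response frequency, mean score of respondents selecting each option, and a point-biserial per option) can be sketched as below on simulated data; the MSI itself is defined in the article and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
options = np.array(["A", "B", "C", "D"])
choices = rng.choice(options, size=n, p=[0.5, 0.2, 0.2, 0.1])  # item responses
total = rng.normal(50, 10, size=n)  # simulated total test scores

for opt in options:
    sel = (choices == opt).astype(float)
    freq = sel.mean()                    # option response frequency
    mean_score = total[sel == 1].mean()  # mean score of those choosing the option
    pbi = np.corrcoef(sel, total)[0, 1]  # point-biserial index for the option
    print(f"{opt}: freq={freq:.2f}, mean={mean_score:.1f}, r_pb={pbi:+.2f}")
```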
Thompson, Kathryn N. – ProQuest LLC, 2023
It is imperative to collect validity evidence prior to interpreting and using test scores. During the process of collecting validity evidence, test developers should consider whether test scores are contaminated by sources of extraneous information. This is referred to as construct irrelevant variance, or the "degree to which test scores are…
Descriptors: Test Wiseness, Test Items, Item Response Theory, Scores
Harrison, Scott; Kroehne, Ulf; Goldhammer, Frank; Lüdtke, Oliver; Robitzsch, Alexander – Large-scale Assessments in Education, 2023
Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone to the interpretation of the results of the PISA test scores. However, an…
Descriptors: Scoring, Test Items, Difficulty Level, Foreign Countries
Thayaamol Upapong; Apantee Poonputta – Educational Process: International Journal, 2025
Background/purpose: The purposes of this research are to develop a reliable and valid assessment tool for measuring systems thinking skills in upper primary students in Thailand and to establish a normative criterion for evaluating their systems thinking abilities based on educational standards. Materials/methods: The study followed a three-phase…
Descriptors: Thinking Skills, Elementary School Students, Measures (Individuals), Foreign Countries
Camenares, Devin – International Journal for the Scholarship of Teaching and Learning, 2022
Balancing assessment of learning outcomes with the expectations of students is a perennial challenge in education. Difficult exams, in which many students perform poorly, exacerbate this problem and can inspire a wide variety of interventions, such as a grading curve. However, addressing poor performance can sometimes distort or inflate grades and…
Descriptors: College Students, Student Evaluation, Tests, Test Items
Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025
The current paper uses the Many-Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…
Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction
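The Many-Facet Rasch Model referenced above extends the Rasch model with additional facets such as rater severity. A minimal sketch of the category probabilities under a rating-scale formulation is shown below; the parameter values are illustrative assumptions, not estimates from the study.

```python
import numpy as np

def mfrm_category_probs(ability, item_difficulty, rater_severity, thresholds):
    """Category probabilities for one person-item-rater combination under a
    rating-scale Many-Facet Rasch Model."""
    # Logit for category k accumulates (ability - item_difficulty -
    # rater_severity - tau_h) over thresholds h <= k; category 0 has logit 0.
    logits = np.cumsum(ability - item_difficulty - rater_severity
                       - np.asarray(thresholds))
    numerators = np.exp(np.concatenate(([0.0], logits)))
    return numerators / numerators.sum()

# Illustrative values: able examinee, average item, lenient rater, 4 categories.
print(mfrm_category_probs(1.0, 0.0, -0.5, thresholds=[-1.0, 0.0, 1.0]).round(3))
```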
Cronin, Sean D. – ProQuest LLC, 2023
This convergent, parallel, mixed-methods study with qualitative and quantitative content analysis methods was conducted to identify what type of thinking is required by the College and Career Readiness Assessment (CCRA+) by (a) determining the frequency and percentage of questions categorized as higher-level thinking within each cell of Hess'…
Descriptors: Cues, College Readiness, Career Readiness, Test Items
Tsutsumi, Emiko; Kinoshita, Ryo; Ueno, Maomi – International Educational Data Mining Society, 2021
Knowledge tracing (KT), the task of tracking the knowledge state of each student over time, has been studied actively by artificial intelligence researchers. Recent reports have described that Deep-IRT, which combines Item Response Theory (IRT) with a deep learning model, provides superior performance. It can express the abilities of each student…
Descriptors: Item Response Theory, Prediction, Accuracy, Artificial Intelligence
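Deep-IRT's defining step is an IRT-style output layer applied to network-estimated ability and difficulty; a commonly cited formulation (Yeung, 2019) predicts correctness as sigmoid(3.0 * theta - beta). The sketch below shows only that final prediction step, with made-up inputs standing in for the values a trained network would emit.

```python
import numpy as np

def deep_irt_output(theta, beta, scale=3.0):
    """IRT-style prediction layer used in Deep-IRT: P(correct) from a
    network-estimated student ability theta and item difficulty beta."""
    return 1.0 / (1.0 + np.exp(-(scale * theta - beta)))

# Made-up stand-ins for network outputs, for illustration only.
print(deep_irt_output(theta=0.4, beta=0.2))
```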