Publication Date
| In 2026 | 0 |
| Since 2025 | 15 |
| Since 2022 (last 5 years) | 43 |
| Since 2017 (last 10 years) | 133 |
| Since 2007 (last 20 years) | 194 |
Descriptor
| Item Response Theory | 219 |
| Test Items | 219 |
| Test Reliability | 219 |
| Test Validity | 110 |
| Foreign Countries | 76 |
| Difficulty Level | 72 |
| Psychometrics | 69 |
| Test Construction | 64 |
| Scores | 45 |
| Goodness of Fit | 35 |
| Item Analysis | 35 |
| More ▼ | |
Source
Author
| Schoen, Robert C. | 6 |
| Anderson, Daniel | 4 |
| Petscher, Yaacov | 4 |
| Bauduin, Charity | 3 |
| Boone, William J. | 3 |
| Paek, Insu | 3 |
| Yang, Xiaotong | 3 |
| Zhang, Jinming | 3 |
| Bichi, Ado Abdu | 2 |
| Brown, Ted | 2 |
| Dogan, Nuri | 2 |
| More ▼ | |
Publication Type
Education Level
Audience
Location
| Indonesia | 16 |
| Florida | 8 |
| Turkey | 7 |
| Germany | 6 |
| United States | 6 |
| Taiwan | 5 |
| Australia | 4 |
| Iran | 4 |
| California | 3 |
| Canada | 3 |
| China | 3 |
| More ▼ | |
Laws, Policies, & Programs
| Individuals with Disabilities… | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Kelsey Nason; Christine DeMars – Journal of Educational Measurement, 2025
This study examined the widely used threshold of 0.2 for Yen's Q3, an index for violations of local independence. Specifically, a simulation was conducted to investigate whether Q3 values were related to the magnitude of bias in estimates of reliability, item parameters, and examinee ability. Results showed that Q3 values below the typical cut-off…
Descriptors: Item Response Theory, Statistical Bias, Test Reliability, Test Items
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025
Response styles pose great threats to psychological measurements. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and total-score level (ratios of extreme and middle responses to vignettes). Four models…
Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
Akif Avcu – International Journal of Psychology and Educational Studies, 2025
This review explores the significant contributions of Rasch modeling in enhancing classroom assessment practices, particularly in measuring student attitudes. Classroom assessment has evolved from standardized testing to integrative practices that emphasize both academic and affective dimensions of student development. Accurate attitude…
Descriptors: Item Response Theory, Student Attitudes, Student Evaluation, Attitude Measures
Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024
This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…
Descriptors: Korean, Test Validity, Test Reliability, Imitation
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
Mahdi Ghorbankhani; Keyvan Salehi – SAGE Open, 2025
Academic procrastination, the tendency to delay academic tasks without reasonable justification, has significant implications for students' academic performance and overall well-being. To measure this construct, numerous scales have been developed, among which the Academic Procrastination Scale (APS) has shown promise in assessing academic…
Descriptors: Psychometrics, Measures (Individuals), Time Management, Foreign Countries
Aybek, Eren Can; Toraman, Cetin – International Journal of Assessment Tools in Education, 2022
The current study investigates the optimum number of response categories for the Likert type of scales under the item response theory (IRT). The data was collected from university students attend to mainly the faculty of medicine and the faculty of education. A form of the "Social Gender Equity Scale" developed by Gozutok et al. (2017)…
Descriptors: Likert Scales, Item Response Theory, College Students, Test Reliability
Grace C. Tetschner; Sachin Nedungadi – Chemistry Education Research and Practice, 2025
Many undergraduate chemistry students hold alternate conceptions related to resonance--an important and fundamental topic of organic chemistry. To help address these alternate conceptions, an organic chemistry instructor could administer the resonance concept inventory (RCI), which is a multiple-choice assessment that was designed to identify…
Descriptors: Scientific Concepts, Concept Formation, Item Response Theory, Scores
Thompson, Kathryn N. – ProQuest LLC, 2023
It is imperative to collect validity evidence prior to interpreting and using test scores. During the process of collecting validity evidence, test developers should consider whether test scores are contaminated by sources of extraneous information. This is referred to as construct irrelevant variance, or the "degree to which test scores are…
Descriptors: Test Wiseness, Test Items, Item Response Theory, Scores
Balbuena, Sherwin – International Journal of Assessment Tools in Education, 2023
Depression is a latent characteristic that is measured through self-reported or clinician-mediated instruments such as scales and inventories. The precision of depression estimates largely depends on the validity of the items used and on the truthfulness of people responding to these items. The existing methodology in instrumentation based on a…
Descriptors: Depression (Psychology), Test Items, Test Validity, Test Reliability
Y. Yokhebed; Rexy Maulana Dwi Karmadi; Luvia Ranggi Nastiti – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025
Although self-assessment in critical thinking is thought to help students recognise their strengths and weaknesses, the reliability and validity of the assessment tool is still questionable, so a more objective evaluation is needed. Objective of this investigation is to assess the self-assessment tools in evaluating students' critical thinking…
Descriptors: Self Evaluation (Individuals), Critical Thinking, Science and Society, Test Validity
Arandha May Rachmawati; Agus Widyantoro – English Language Teaching Educational Journal, 2025
This study aims to evaluate the quality of English reading comprehension test instruments used in informal learning, especially as English literacy tests. With a quantitative approach, the analysis was carried out using the Rasch model through the Quest program on 30 multiple-choice questions given to 30 grade IX students from informal educational…
Descriptors: Item Response Theory, Reading Tests, Reading Comprehension, English (Second Language)
Fergadiotis, Gerasimos; Casilio, Marianne; Dickey, Michael Walsh; Steel, Stacey; Nicholson, Hannele; Fleegle, Mikala; Swiderski, Alexander; Hula, William D. – Journal of Speech, Language, and Hearing Research, 2023
Purpose: Item response theory (IRT) is a modern psychometric framework with several advantageous properties as compared with classical test theory. IRT has been successfully used to model performance on anomia tests in individuals with aphasia; however, all efforts to date have focused on noun production accuracy. The purpose of this study is to…
Descriptors: Item Response Theory, Psychometrics, Verbs, Naming

Peer reviewed
Direct link
