Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 12 |
| Since 2017 (last 10 years) | 30 |
| Since 2007 (last 20 years) | 76 |
Descriptor
| Test Items | 143 |
| Test Validity | 106 |
| Test Construction | 91 |
| Test Reliability | 45 |
| Item Analysis | 26 |
| Item Response Theory | 25 |
| Validity | 23 |
| Evaluation Methods | 20 |
| Psychometrics | 20 |
| Scores | 20 |
| Scoring | 19 |
Author
| Stansfield, Charles W. | 6 |
| Liu, Kimy | 3 |
| Sireci, Stephen G. | 3 |
| Embretson, Susan E. | 2 |
| Ferrando, Pere J. | 2 |
| Geller, Josh | 2 |
| Irvin, P. Shawn | 2 |
| Jung, Eunju | 2 |
| Ketterlin-Geller, Leanne R. | 2 |
| Lee, Yi-Hsuan | 2 |
| Petscher, Yaacov | 2 |
Education Level
| Elementary Education | 9 |
| Elementary Secondary Education | 8 |
| Grade 5 | 8 |
| Higher Education | 8 |
| Secondary Education | 8 |
| Grade 4 | 7 |
| Middle Schools | 7 |
| Grade 6 | 6 |
| Grade 7 | 6 |
| High Schools | 6 |
| Junior High Schools | 6 |
Location
| Australia | 3 |
| Massachusetts | 3 |
| Missouri | 3 |
| Oregon | 3 |
| Florida | 2 |
| Idaho | 2 |
| Japan | 2 |
| New Mexico | 2 |
| Tennessee | 2 |
| Washington | 2 |
| Canada | 1 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 4 |
| Comprehensive Education… | 2 |
Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Meike Akveld; George Kinnear – International Journal of Mathematical Education in Science and Technology, 2024
Many universities use diagnostic tests to assess incoming students' preparedness for mathematics courses. Diagnostic test results can help students to identify topics where they need more practice and give lecturers a summary of strengths and weaknesses in their class. We demonstrate a process that can be used to make improvements to a mathematics…
Descriptors: Mathematics Tests, Diagnostic Tests, Test Items, Item Analysis
Deng, Jacky M.; Streja, Nicholas; Flynn, Alison B. – Journal of Chemical Education, 2021
Response process validity evidence can provide researchers with insight into how and why participants interpret items on instruments such as tests and questionnaires. In chemistry education research literature and the social sciences more broadly, response process validity evidence has been used and reported in a variety of ways. This paper's…
Descriptors: Chemistry, Science Education, Educational Research, Validity
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction
Marc Brysbaert – Cognitive Research: Principles and Implications, 2024
Experimental psychology is witnessing an increase in research on individual differences, which requires the development of new tasks that can reliably assess variations among participants. To do this, cognitive researchers need statistical methods that many researchers have not learned during their training. The lack of expertise can pose…
Descriptors: Experimental Psychology, Individual Differences, Statistical Analysis, Task Analysis
Martin, David; Jamieson-Proctor, Romina – International Journal of Research & Method in Education, 2020
In Australia, one of the key findings of the Teacher Education Ministerial Advisory Group was that not all graduating pre-service teachers possess adequate pedagogical content knowledge (PCK) to teach effectively. The concern is that higher education providers working with pre-service teachers are using pedagogical practices and assessments which…
Descriptors: Test Construction, Preservice Teachers, Pedagogical Content Knowledge, Foreign Countries
Chen, Yunxiao; Lee, Yi-Hsuan; Li, Xiaoou – Journal of Educational and Behavioral Statistics, 2022
In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric…
Descriptors: Standardized Tests, Test Items, Test Validity, Scores
Fournier, Geneviève; Lachance, Lise; Viviers, Simon; Lahrizi, Imane Zineb; Goyer, Liette; Masdonati, Jonas – International Journal for Educational and Vocational Guidance, 2020
The paper first presents the theoretical foundations used to develop a pre-experimental version of a questionnaire on relationship to work, and then the four stages of its initial validation leading to an experimental version. These stages included: (1) Defining the dimensions and sub-dimensions of the relationship to work concept; (2)…
Descriptors: Test Construction, Content Validity, Work Attitudes, Test Items
Krupa, Erin Elizabeth; Carney, Michele; Bostic, Jonathan – Applied Measurement in Education, 2019
This article provides a brief introduction to the set of four articles in the special issue. To provide a foundation for the issue, key terms are defined, a brief historical overview of validity is provided, and several different validation approaches used in the issue are described. Finally, the contribution of the articles to…
Descriptors: Test Items, Program Validation, Test Validity, Mathematics Education
Cobern, William W.; Adams, Betty A. J. – International Journal of Assessment Tools in Education, 2020
What follows is a practical guide for establishing the validity of a survey for research purposes. The motivation for providing this guide is our observation that researchers who are not survey researchers per se, but who want to use a survey method, lack a concise resource on validity. There is far more to know about surveys and survey…
Descriptors: Surveys, Test Validity, Test Construction, Test Items
Anderson, Daniel; Rowley, Brock; Stegenga, Sondra; Irvin, P. Shawn; Rosenberg, Joshua M. – Educational Measurement: Issues and Practice, 2020
Validity evidence based on test content is critical to meaningful interpretation of test scores. Within high-stakes testing and accountability frameworks, content-related validity evidence is typically gathered via alignment studies, with panels of experts providing qualitative judgments on the degree to which test items align with the…
Descriptors: Content Validity, Artificial Intelligence, Test Items, Vocabulary
Maddox, Bryan – OECD Publishing, 2023
The digital transition in educational testing has introduced many new opportunities for technology to enhance large-scale assessments. These include the potential to collect and use log data on test-taker response processes routinely, and on a large scale. Process data has long been recognised as a valuable source of validation evidence in…
Descriptors: Measurement, Inferences, Test Reliability, Computer Assisted Testing
Thomas Bickerton, Robert; Sangwin, Chris J. – International Journal of Mathematical Education in Science and Technology, 2022
We discuss a practical method for assessing mathematical proof online. We examine the use of faded worked examples and reading comprehension questions to understand proof. By breaking down a given proof, we formulate a checklist that can be used to generate comprehension questions which can be assessed automatically online. We then provide some…
Descriptors: Mathematics Instruction, Validity, Mathematical Logic, Evaluation Methods
Ketabi, Somaye; Alavi, Seyyed Mohammed; Ravand, Hamdollah – International Journal of Language Testing, 2021
Although Diagnostic Classification Models (DCMs) were introduced to the education system decades ago, it seems that these models have not been employed for the original aims for which they were designed. DCMs have mostly been used to analyze large-scale non-diagnostic tests and have rarely been used in developing Cognitive…
Descriptors: Diagnostic Tests, Test Construction, Goodness of Fit, Classification
