Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Jacqueline Raymond; David Wei Dai; Sue McAllister – Advances in Health Sciences Education, 2025
There is increasing interest in health professions education (HPE) in applying argument-based validity approaches, such as Kane's, to assessment design. The critical first step in employing Kane's approach is to specify the interpretation-use argument (IUA). However, in the HPE literature, this step is often poorly articulated. This article…
Descriptors: Allied Health Occupations Education, Test Interpretation, Test Construction, Inferences
Andrew P. Jaciw – American Journal of Evaluation, 2025
By design, randomized experiments (XPs) rule out bias from confounded selection of participants into conditions. Quasi-experiments (QEs) are often considered second-best because they do not share this benefit. However, when results from XPs are used to generalize causal impacts, the benefit from unconfounded selection into conditions may be offset…
Descriptors: Elementary School Students, Elementary School Teachers, Generalization, Test Bias
Nebraska Department of Education, 2024
The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…
Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students
Desiree Kawabata; Ben Fenton-Smith – Australian Journal of Language and Literacy, 2025
This paper discusses the challenges of defining coherence in the context of oral language assessment literacy and proposes that better understanding of the construct can be achieved through a systemic-functional linguistic lens. Coherence is taken to be a foundational quality of written and spoken discourse and is a standard feature in the…
Descriptors: Oral Language, Assessment Literacy, Linguistics, English (Second Language)
Anne Wicks; Robin Berkley – George W. Bush Institute, 2025
Assessments are one of the most important--and often misunderstood--elements of education. In most cases, tests are administered by the state as well as by districts and schools. Assessments at each of these levels have distinct purposes, yield different information, and are part of a powerful, coordinated approach to improving student outcomes.…
Descriptors: Student Evaluation, Testing, Tests, Standardized Tests
Philipp Sterner; Kim De Roover; David Goretzko – Structural Equation Modeling: A Multidisciplinary Journal, 2025
When comparing relations and means of latent variables, it is important to establish measurement invariance (MI). Most methods to assess MI are based on confirmatory factor analysis (CFA). Recently, new methods have been developed based on exploratory factor analysis (EFA); most notably, as extensions of multi-group EFA, researchers introduced…
Descriptors: Error of Measurement, Measurement Techniques, Factor Analysis, Structural Equation Models
Cara Cahalan Laitusis; Meagan Karvonen – Educational Measurement: Issues and Practice, 2025
The 2014 "Standards for Educational and Psychological Testing" describe universal design as an approach that offers promise for improving the fairness of educational assessments. As the field reconsiders questions of fairness in assessments, we propose a new framework that addresses the entire assessment lifecycle: universal design of…
Descriptors: Educational Assessment, Access to Education, Systems Approach, Psychological Needs
Denise Swanson; Gerald Tindal – Behavioral Research and Teaching, 2024
This technical report provides an authoritative bibliographic resource of all the studies conducted on "easyCBM"® and published on the main website for Behavioral Research and Teaching under Publications (https://brtprojects.org). The "easyCBM"® software is a direct descendant of "Curriculum-based Measurement" (CBM)…
Descriptors: Bibliographies, Computer Software, Test Construction, Test Reliability
Sonique Sailsman; Emma El-Shami – Quarterly Review of Distance Education, 2024
Nurse educators at the undergraduate level spend significant time developing and revising exam questions. Following the exam administration, course faculty have the opportunity to complete an item analysis and question revision to improve reliability and validity. A challenge faculty face is tracking these exam changes when teaching as part of a…
Descriptors: Nursing Education, Nursing Students, College Faculty, Test Construction
Scott J. Peters; Matthew C. Makel; Lindsay Ellis Lee; Tamra Stambaugh; Matthew T. McBee; D. Betsy McCoach; Kiana R. Johnson – Gifted Child Today, 2024
Universal screening is one of the most-common topics and well-accepted best practices within the field of gifted and talented education. There appears to be little disagreement that universally screening all students as part of a gifted and talented identification process results in fewer missed students. But surprisingly, there is little guidance…
Descriptors: Academically Gifted, Talent Identification, Screening Tests, Test Validity
Benjawan Plengkham; Sonthaya Rattanasak; Patsawut Sukserm – Journal of Education and Learning, 2025
This academic article provides the essential steps for designing an effective English questionnaire in social science research, with a focus on ensuring clarity, cultural sensitivity and ethical integrity. Developed from key insights from related studies, it outlines potential practice in questionnaire design, item development and the importance…
Descriptors: Guidelines, Test Construction, Questionnaires, Surveys
Bryan Maddox – OECD Publishing, 2023
The digital transition in educational testing has introduced many new opportunities for technology to enhance large-scale assessments. These include the potential to collect and use log data on test-taker response processes routinely, and on a large scale. Process data has long been recognised as a valuable source of validation evidence in…
Descriptors: Measurement, Inferences, Test Reliability, Computer Assisted Testing

