Publication Date
| In 2026 | 0 |
| Since 2025 | 4 |
| Since 2022 (last 5 years) | 15 |
| Since 2017 (last 10 years) | 46 |
| Since 2007 (last 20 years) | 101 |
Descriptor
| Test Construction | 365 |
| Testing | 365 |
| Test Validity | 329 |
| Test Reliability | 182 |
| Language Tests | 78 |
| Scoring | 73 |
| Test Interpretation | 58 |
| Student Evaluation | 48 |
| English (Second Language) | 45 |
| Second Language Learning | 43 |
| Test Items | 43 |
Location
| California | 6 |
| New York | 6 |
| Australia | 4 |
| Canada | 4 |
| China | 4 |
| Pennsylvania | 4 |
| Brazil | 3 |
| Iran | 3 |
| Japan | 3 |
| Maryland | 3 |
| Nebraska | 3 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 4 |
| Every Student Succeeds Act… | 2 |
| Civil Rights Act 1964 Title… | 1 |
| Elementary and Secondary… | 1 |
| Lau v Nichols | 1 |
Patrisius Istiarto Djiwandono; Daniel Ginting – Language Education & Assessment, 2025
The teaching of English as a foreign language in Indonesia has a long history, and it is always important to ask whether the assessment of the students' language skills has been valid and reliable. A screening of many articles in several prominent databases reveals that a number of evaluation studies have been done by Indonesian scholars in the…
Descriptors: Foreign Countries, Language Tests, English (Second Language), Second Language Learning
Kun Su – ProQuest LLC, 2022
This dissertation provides a start-to-finish description of development, administration, and validation for an online middle-school physics test using a DCM framework with response-time. The first paper illustrated the process of implementing DCM with a careful selection of the content domain and a simulation approach for a Q-matrix construction.…
Descriptors: Science Instruction, Physics, Middle Schools, Testing
Yan Jin; Jason Fan – Language Assessment Quarterly, 2023
In language assessment, AI technology has been incorporated in task design, assessment delivery, automated scoring of performance-based tasks, score reporting, and provision of feedback. AI technology is also used for collecting and analyzing performance data in language assessment validation. Research has been conducted to investigate the…
Descriptors: Language Tests, Artificial Intelligence, Computer Assisted Testing, Test Format
Jeff Allen; Jay Thomas; Stacy Dreyer; Scott Johanningmeier; Dana Murano; Ty Cruce; Xin Li; Edgar Sanchez – ACT Education Corp., 2025
This report describes the process of developing and validating the enhanced ACT. The report describes the changes made to the test content and the processes by which these design decisions were implemented. The authors describe how they shared the overall scope of the enhancements, including the initial blueprints, with external expert panels,…
Descriptors: College Entrance Examinations, Testing, Change, Test Construction
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Weideman, Albert – Language Assessment Quarterly, 2022
This paper will deal, firstly, with the South African context, which cries out for attention to responsible language assessment. The renewed interest in language testing in South Africa is well illustrated in assessments of language ability for educational purposes generally, and more specifically in the assessment of academic literacy. Secondly,…
Descriptors: Foreign Countries, Language Tests, Testing, Academic Language
Mansooreh Hosseinnia; Zahra Kafi – Language Testing in Asia, 2024
As testing involves various aspects of education as well as the ones who are involved like instructors, students, managers, teacher trainers, testers, and decision-makers, it comes to be highly crucial to develop ethical tests. In addition, as some methods of testing are more favored and practiced compared to others without considering the ethical…
Descriptors: Test Construction, Test Validity, Ethics, Testing
W. James Popham – Pearson, 2024
"Classroom Assessment" shows pre- and in-service teachers how to use classroom testing accurately and formatively to dramatically increase their teaching effectiveness and promote student learning. In addition to clear and concise guidelines on how to develop and use quality classroom assessments, the author also focuses on the teaching…
Descriptors: Student Evaluation, Testing, Teacher Effectiveness, Test Construction
Ketabi, Somaye; Alavi, Seyyed Mohammed; Ravand, Hamdollah – International Journal of Language Testing, 2021
Although Diagnostic Classification Models (DCMs) were introduced to the education system decades ago, it seems that these models were not employed for the original aims for which they had been designed. DCMs have mostly been used to analyze large-scale non-diagnostic tests, and these models have rarely been used in developing Cognitive…
Descriptors: Diagnostic Tests, Test Construction, Goodness of Fit, Classification
Andres De Los Reyes; Mo Wang; Matthew D. Lerner; Bridget A. Makol; Olivia M. Fitzpatrick; John R. Weisz – Grantee Submission, 2022
Researchers strategically assess youth mental health by soliciting reports from multiple informants. Typically, these informants (e.g., parents, teachers, youth themselves) vary in the social contexts where they observe youth. Decades of research reveal that the most common data conditions produced with this approach consist of discrepancies…
Descriptors: Mental Health, Measurement Techniques, Evaluation Methods, Research
Bearman, Margaret; Ajjawi, Rola; Bennett, Sue; Boud, David – Advances in Health Sciences Education, 2021
Objective Structured Clinical Examinations (OSCEs) have become ubiquitous as a form of assessment in medical education but involve substantial resource demands and considerable local variation. A detailed understanding of the processes by which OSCEs are designed and administered could improve feasibility and sustainability. This exploration of…
Descriptors: Performance Based Assessment, Medical Education, Test Construction, Testing
NWEA, 2022
This technical report documents the processes and procedures employed by NWEA® to build and support the English MAP® Reading Fluency™ assessments administered during the 2020-2021 school year. It is written for measurement professionals and administrators to help evaluate the quality of MAP Reading Fluency. The seven sections of this report: (1)…
Descriptors: Achievement Tests, Reading Tests, Reading Achievement, Reading Fluency
Dadey, Nathan; Gong, Brian – Smarter Balanced Assessment Consortium, 2023
This document is written primarily for policy makers and state department of education staff who are considering through-year assessments, as well as consultants and contractors state departments rely on. The document identifies essential things to consider when designing or evaluating a through-year assessment program. The paper is organized into…
Descriptors: Student Evaluation, Progress Monitoring, Summative Evaluation, Standardized Tests
International Journal of Testing, 2018
The second edition of the International Test Commission Guidelines for Translating and Adapting Tests was prepared between 2005 and 2015 to improve upon the first edition, and to respond to advances in testing technology and practices. The 18 guidelines are organized into six categories to facilitate their use: pre-condition (3), test development…
Descriptors: Translation, Test Construction, Testing, Scoring

