Publication Date
| In 2026 | 0 |
| Since 2025 | 17 |
| Since 2022 (last 5 years) | 74 |
| Since 2017 (last 10 years) | 162 |
| Since 2007 (last 20 years) | 247 |
Descriptor
| Testing Problems | 1334 |
| Test Validity | 322 |
| Higher Education | 253 |
| Test Construction | 245 |
| Test Reliability | 231 |
| Foreign Countries | 215 |
| Elementary Secondary Education | 208 |
| Test Items | 196 |
| Standardized Tests | 186 |
| Achievement Tests | 183 |
| Test Bias | 172 |
Author
| Weiss, David J. | 7 |
| Frary, Robert B. | 6 |
| Reckase, Mark D. | 6 |
| Sinharay, Sandip | 6 |
| Wainer, Howard | 6 |
| Wilcox, Rand R. | 6 |
| Drasgow, Fritz | 5 |
| Green, Donald Ross | 5 |
| Plake, Barbara S. | 5 |
| Donlon, Thomas F. | 4 |
| Engelhard, George, Jr. | 4 |
Location
| Canada | 21 |
| China | 16 |
| Iran | 15 |
| Australia | 11 |
| Japan | 11 |
| Turkey | 11 |
| United States | 9 |
| Israel | 8 |
| United Kingdom | 8 |
| New Jersey | 7 |
| Sweden | 7 |
What Works Clearinghouse Rating
| Meets WWC Standards with or without Reservations | 1 |
Karoline A. Sachse; Sebastian Weirich; Nicole Mahler; Camilla Rjosk – International Journal of Testing, 2024
In order to ensure content validity by covering a broad range of content domains, the testing times of some educational large-scale assessments last up to a total of two hours or more. Performance decline over the course of taking the test has been extensively documented in the literature. It can occur due to increases in the numbers of: (a)…
Descriptors: Test Wiseness, Test Score Decline, Testing Problems, Foreign Countries
Using Performance Assessments Instead of High-Stakes Tests: A Promising Strategy for a Better Future
Hani Morgan – Policy Futures in Education, 2025
According to the National Assessment of Educational Progress (NAEP), US reading and math scores have recently dropped. This decline is believed to be the result of the school closings that occurred during the COVID-19 pandemic. The drop likely contributed to the increase in pressure educators are feeling to teach in a way that leads to higher test…
Descriptors: Performance Based Assessment, High Stakes Tests, Standardized Tests, Testing Problems
Yi Zou; Ying Zheng; Jingwen Wang – International Journal of Language Testing, 2025
The Pearson Test of English Academic (PTE-A), a widely used high-stakes language proficiency test for university admissions and migration purposes, underwent a notable change from a three-hour to a two-hour version in November 2021. The implementation of the new version has prompted inquiries into the washback effects on various stakeholders.…
Descriptors: Testing Problems, Test Preparation, High Stakes Tests, English (Second Language)
Hyeryung Lee; Walter P. Vispoel – Journal of Educational Measurement, 2025
Traditional methods for detecting cheating on assessments tend to focus on either identifying cheaters or compromised items in isolation, overlooking their interconnection. In this study, we present a novel biclustering approach that simultaneously detects both cheaters and compromised items by identifying coherent subgroups of examinees and items…
Descriptors: Identification, Cheating, Test Wiseness, Test Items
Esra Sözer Boz – Education and Information Technologies, 2025
International large-scale assessments provide cross-national data on students' cognitive and non-cognitive characteristics. A critical methodological issue that often arises in comparing data from cross-national studies is ensuring measurement invariance, indicating that the construct under investigation is the same across the compared groups.…
Descriptors: Achievement Tests, International Assessment, Foreign Countries, Secondary School Students
Sasima Charubusp; Orawan Wangsombat; Napatacha Sriwichai; Chanida Phongnapharuk – PASAA: Journal of Language Teaching and Learning in Thailand, 2025
Washback refers to the impact of a test on instruction and learning, with high-stakes tests exerting both positive and negative effects. This study examined the washback of an English exit exam (EEE) on English language learning at a Thai university where English-medium instruction is used in most academic disciplines. The EEE is an in-house…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Language Tests
Okan Bulut; Guher Gorgun; Hacer Karamese – Journal of Educational Measurement, 2025
The use of multistage adaptive testing (MST) has gradually increased in large-scale testing programs as MST achieves a balanced compromise between linear test design and item-level adaptive testing. MST works on the premise that each examinee gives their best effort when attempting the items, and their responses truly reflect what they know or can…
Descriptors: Response Style (Tests), Testing Problems, Testing Accommodations, Measurement
Taehyeong Kim; Byungmin Lee – Language Assessment Quarterly, 2025
The Korean College Scholastic Ability Test (CSAT) aims to assess Korean high school students' scholastic ability required for college readiness. As a high-stakes test, the examination serves as a pivotal hurdle for university admission and exerts a strong washback effect on the educational system in Korea. The present study set out to investigate…
Descriptors: Reading Comprehension, Reading Tests, Language Tests, Multiple Choice Tests
Peiyu Wang; Liying Cheng – Critical Inquiry in Language Studies, 2025
This study employed a multi-methods design to investigate the impact of preparation on Chinese test-takers' perceptions of the integrated TOEFL iBT speaking and writing design. Combining results from over 1700 surveys and 10 interviews, it was found that these Chinese test-takers, who are the most vulnerable group in the multimillion testing…
Descriptors: Foreign Countries, Second Language Learning, English (Second Language), Language Tests
Firdissa J. Aga – Intersection: A Journal at the Intersection of Assessment and Learning, 2024
The study investigated hurdles to the quality of student learning assessment by examining issues related to assessment procedures and practices, learners and learning, learning resources and test constructs, and test administration and feedback. Quantitative and qualitative data were collected from two Ethiopian universities using two types of…
Descriptors: Foreign Countries, College Faculty, College Students, Test Construction
Andrés Christiansen; Rianne Janssen – Educational Assessment, Evaluation and Accountability, 2024
In international large-scale assessments, students may not be compelled to answer every test item: a student can decide to skip a seemingly difficult item or may drop out before the end of the test is reached. The way these missing responses are treated will affect the estimation of the item difficulty and student ability, and ultimately affect…
Descriptors: Test Items, Item Response Theory, Grade 4, International Assessment
Alex Buckley – Studies in Higher Education, 2024
Despite a large amount of critical research literature, traditional examinations continue to be widely used in higher education. This article reviews recent literature in order to assess the role played by the approaches adopted by researchers in the gap between research on exams, and the way exams are used. Viviane Robinson's 'problem-based…
Descriptors: Literature Reviews, Testing, Higher Education, Testing Problems
Gökhan Iskifoglu – Turkish Online Journal of Educational Technology - TOJET, 2024
This research paper investigated the importance of conducting measurement invariance analysis in developing measurement tools for assessing differences between and among study variables. Most of the studies, which tended to develop an inventory to assess the existence of an attitude, behavior, belief, IQ, or an intuition in a person's…
Descriptors: Testing, Testing Problems, Error of Measurement, Attitude Measures
James Dean Brown; Ali Panahi; Hassan Mohebbi – Language Teaching Research Quarterly, 2023
Panahi and Mohebbi review James Dean Brown's 50 years of research in language testing, curriculum development, and research statistics with reference to an impressionistic framework for analysis containing two components with their subcomponents: annotations (i.e., briefing and implications) and main concepts and themes (i.e., testing and teaching…
Descriptors: Second Language Learning, Second Language Instruction, Language Tests, Curriculum Development
Uysal, Ibrahim; Sahin-Kürsad, Merve; Kiliç, Abdullah Faruk – Participatory Educational Research, 2022
The aim of the study was to examine whether the common items in mixed-format tests (e.g., multiple-choice and essay items) contain parameter drift in test equating processes performed with the common-item nonequivalent groups design. In this study, which was carried out using Monte Carlo simulation with a fully crossed design, the factors of test…
Descriptors: Test Items, Test Format, Item Response Theory, Equated Scores