Publication Date
| Date Range | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 19 |
| Since 2022 (last 5 years) | 66 |
| Since 2017 (last 10 years) | 141 |
| Since 2007 (last 20 years) | 198 |
Author
| Author | Results |
| --- | --- |
| Yamamoto, Kentaro | 6 |
| Shin, Hyo Jeong | 5 |
| von Davier, Matthias | 5 |
| Debeer, Dries | 4 |
| Janssen, Rianne | 4 |
| Khorramdel, Lale | 4 |
| Sälzer, Christine | 4 |
| Yang, Ji Seung | 4 |
| Braeken, Johan | 3 |
| Cai, Li | 3 |
| Ercikan, Kadriye | 3 |
Education Level
| Education Level | Results |
| --- | --- |
| Secondary Education | 166 |
| Elementary Secondary Education | 15 |
| Elementary Education | 10 |
| Junior High Schools | 10 |
| Middle Schools | 10 |
| High Schools | 9 |
| Grade 9 | 7 |
| Higher Education | 7 |
| Grade 4 | 6 |
| Grade 8 | 6 |
| Intermediate Grades | 6 |
Audience
| Audience | Results |
| --- | --- |
| Administrators | 1 |
| Policymakers | 1 |
Location
| Location | Results |
| --- | --- |
| Australia | 23 |
| United States | 18 |
| Finland | 17 |
| China | 15 |
| Turkey | 15 |
| Canada | 14 |
| Germany | 13 |
| South Korea | 12 |
| Japan | 10 |
| Norway | 10 |
| Singapore | 10 |
Laws, Policies, & Programs
| Law / Program | Results |
| --- | --- |
| No Child Left Behind Act 2001 | 2 |
| Education for All Handicapped… | 1 |
| Elementary and Secondary… | 1 |
| Individuals with Disabilities… | 1 |
Huang, Hung-Yu – Educational and Psychological Measurement, 2020
In educational assessments and achievement tests, test developers and administrators commonly assume that test-takers attempt all test items with full effort and leave no blank responses with unplanned missing values. However, aberrant response behavior--such as performance decline, dropping out beyond a certain point, and skipping certain items…
Descriptors: Item Response Theory, Response Style (Tests), Test Items, Statistical Analysis
Ulitzsch, Esther; Domingue, Benjamin W.; Kapoor, Radhika; Kanopka, Klint; Rios, Joseph A. – Educational Measurement: Issues and Practice, 2023
Common response-time-based approaches for non-effortful response behavior (NRB) in educational achievement tests filter responses that are associated with response times below some threshold. These approaches are, however, limited in that they require a binary decision on whether a response is classified as stemming from NRB; thus ignoring…
Descriptors: Reaction Time, Responses, Behavior, Achievement Tests
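The threshold-based filtering that this abstract critiques can be sketched in a few lines. This is a minimal illustration, not the authors' method: the 5-second cutoff and the response-time data are invented, and the point of the sketch is the binary flagging decision the abstract identifies as a limitation.

```python
# Minimal sketch of response-time threshold filtering for non-effortful
# response behavior (NRB). The cutoff and data are hypothetical.
response_times = [2.1, 14.8, 0.9, 22.3, 4.7]  # seconds spent on each item

THRESHOLD = 5.0  # a single fixed cutoff: the binary decision the abstract critiques

# Flag responses faster than the cutoff as presumed non-effortful.
flags = [rt < THRESHOLD for rt in response_times]
print(flags)  # [True, False, True, False, True]
```

Each response is reduced to a yes/no NRB classification, which discards how far below (or above) the cutoff a response time falls.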
Okan Bulut; Guher Gorgun; Hacer Karamese – Journal of Educational Measurement, 2025
The use of multistage adaptive testing (MST) has gradually increased in large-scale testing programs as MST achieves a balanced compromise between linear test design and item-level adaptive testing. MST works on the premise that each examinee gives their best effort when attempting the items, and their responses truly reflect what they know or can…
Descriptors: Response Style (Tests), Testing Problems, Testing Accommodations, Measurement
Kseniia Marcq; Johan Braeken – International Journal of Testing, 2025
The Programme for International Student Assessment (PISA) student questionnaire, despite being designed for low cognitive demand, may induce test burden due to its 306-item length, resulting in increased item nonresponse toward the questionnaire's end. Using the PISA 2018 response data from 80 countries and a cross-classified mixed effects model,…
Descriptors: Achievement Tests, Foreign Countries, Secondary School Students, International Assessment
Arijeta Hulaj; Eda Vula; Fatlume Berisha – European Education, 2025
Mathematical achievement and the factors that influence students' skills to apply their knowledge and reasoning to solving real-world problems are considered a critical area of study within the PISA framework. This study presents a systematic review of the impact of teaching strategies and practices on students' mathematical achievements in PISA.…
Descriptors: International Assessment, Achievement Tests, Foreign Countries, Secondary School Students
Andreas Frey; Christoph König; Aron Fink – Journal of Educational Measurement, 2025
The highly adaptive testing (HAT) design is introduced as an alternative test design for the Programme for International Student Assessment (PISA). The principle of HAT is to be as adaptive as possible when selecting items while accounting for PISA's nonstatistical constraints and addressing issues concerning PISA such as item position effects.…
Descriptors: Adaptive Testing, Test Construction, Alternative Assessment, Achievement Tests
Hyo Jeong Shin; Christoph König; Frederic Robin; Andreas Frey; Kentaro Yamamoto – Journal of Educational Measurement, 2025
Many international large-scale assessments (ILSAs) have switched to multistage adaptive testing (MST) designs to improve measurement efficiency in measuring the skills of the heterogeneous populations around the world. In this context, previous literature has reported the acceptable level of model parameter recovery under the MST designs when the…
Descriptors: Robustness (Statistics), Item Response Theory, Adaptive Testing, Test Construction
Cassandra N. Malcom – ProQuest LLC, 2024
Science, Technology, Engineering, and Math (STEM) skills are increasingly required of students to be successful in higher education and the workforce. Therefore, modeling assessment outcomes accurately, often using more types of student data to get a complete picture of student learning, is increasingly relevant. The Program for International…
Descriptors: Student Evaluation, STEM Education, Science Tests, Achievement Tests
Lundgren, Erik; Eklöf, Hanna – Educational Research and Evaluation, 2020
The present study used process data from a computer-based problem-solving task as indications of behavioural level of test-taking effort, and explored how behavioural item-level effort related to overall test performance and self-reported effort. Variables were extracted from raw process data and clustered. Four distinct clusters were obtained and…
Descriptors: Computer Assisted Testing, Problem Solving, Response Style (Tests), Test Items
Lu, Jing; Wang, Chun – Journal of Educational Measurement, 2020
Item nonresponses are prevalent in standardized testing. They happen either when students fail to reach the end of a test due to a time limit or quitting, or when students choose to omit some items strategically. Oftentimes, item nonresponses are nonrandom, and hence, the missing data mechanism needs to be properly modeled. In this paper, we…
Descriptors: Item Response Theory, Test Items, Standardized Tests, Responses

Yan, Zi; Chiu, Ming Ming – British Educational Research Journal, 2023
Despite the general consensus on the positive impact of formative assessment on student learning, researchers have not shown the underlying mechanisms linking specific formative assessment strategies to academic performance in an international sample. This study examines the link between student and teacher reports of teachers' formative…
Descriptors: Formative Evaluation, Evaluation Methods, Reading Achievement, Correlation
Ulitzsch, Esther; Lüdtke, Oliver; Robitzsch, Alexander – Educational Measurement: Issues and Practice, 2023
Country differences in response styles (RS) may jeopardize cross-country comparability of Likert-type scales. When adjusting for rather than investigating RS is the primary goal, it seems advantageous to impose minimal assumptions on RS structures and leverage information from multiple scales for RS measurement. Using PISA 2015 background…
Descriptors: Response Style (Tests), Comparative Analysis, Achievement Tests, Foreign Countries
Militsa G. Ivanova; Michalis P. Michaelides – Practical Assessment, Research & Evaluation, 2023
Research on methods for measuring examinee engagement with constructed-response items is limited. The present study used data from the PISA 2018 Reading domain to construct and compare indicators of test-taking effort on constructed-response items: response time, number of actions, the union (combining effortless responses detected by either…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
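The union indicator this abstract describes can be illustrated with a small sketch: a response is flagged as effortless if either the response-time indicator or the action-count indicator flags it. The cutoffs and log data below are invented for demonstration and are not taken from the study.

```python
# Hypothetical illustration of a "union" effort indicator for
# constructed-response items: flag a response as effortless if EITHER
# its response time OR its number of logged actions falls below a cutoff.
responses = [
    {"rt": 3.0, "actions": 1},    # fast, few actions  -> flagged
    {"rt": 45.0, "actions": 12},  # slow, many actions -> not flagged
    {"rt": 60.0, "actions": 0},   # slow but no typing/clicking -> flagged
]
RT_CUTOFF = 10.0   # seconds
ACTION_CUTOFF = 2  # logged interactions

union_flags = [
    r["rt"] < RT_CUTOFF or r["actions"] < ACTION_CUTOFF for r in responses
]
print(union_flags)  # [True, False, True]
```

The union catches responses that look effortful on one indicator but not the other (here, the slow response with no logged actions), which is why it flags more responses than either indicator alone.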
Kseniia Marcq; Johan Braeken – Educational Assessment, Evaluation and Accountability, 2024
Gender differences in item nonresponse are well-documented in high-stakes achievement tests, where female students are shown to omit more items than male students. These gender differences in item nonresponse are often linked to differential risk-taking strategies, with females being risk-averse and unwilling to guess on an item, even if it could…
Descriptors: Secondary School Students, International Assessment, Gender Differences, Response Rates (Questionnaires)
Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024
For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) needs to be evaluated in order to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…
Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory
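The MG-DIF criterion stated in this abstract — respondents with the same trait level but from different groups should have equal response probabilities on an item — can be made concrete with a two-parameter logistic (2PL) model. The item parameters below are invented for illustration and do not come from the study.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response given trait level theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

theta = 0.5  # same trait level in both groups

# DIF-free item: identical parameters across groups, so respondents
# at the same theta have identical response probabilities.
p_group1 = p_correct(theta, a=1.2, b=0.0)
p_group2 = p_correct(theta, a=1.2, b=0.0)

# Item exhibiting DIF: difficulty shifted upward in group 2, so a
# group-2 respondent at the same theta has a lower success probability.
p_group2_dif = p_correct(theta, a=1.2, b=0.4)
```

In this sketch DIF shows up as a gap between `p_group1` and `p_group2_dif` at a fixed theta; MG-DIF evaluation asks whether such gaps exist for any of the groups being compared.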
