Maria Bolsinova; Jesper Tijmstra; Leslie Rutkowski; David Rutkowski – Journal of Educational and Behavioral Statistics, 2024
Profile analysis is one of the main tools for studying whether differential item functioning can be related to specific features of test items. While relevant, profile analysis in its current form has two restrictions that limit its usefulness in practice: It assumes that all test items have equal discrimination parameters, and it does not test…
Descriptors: Test Items, Item Analysis, Generalizability Theory, Achievement Tests
Selim Dasçioglu; Tuncay Ögretmen – International Journal of Assessment Tools in Education, 2024
The purpose of this research is to determine whether PISA 2018 mathematical literacy test items show differential item functioning across countries. For this purpose, only the items in booklet number three were examined using the MIMIC method with a latent class analysis (LCA) approach. PISA 2018 tests are mostly developed in English. Therefore,…
Descriptors: Test Items, Item Analysis, Mathematics Tests, Literacy
Marjo Sirén; Sari Sulkunen – Scandinavian Journal of Educational Research, 2025
This study examined which aspects of critical literacy are focused on in the reading literacy assessment for the Programme for International Student Assessment (PISA) 2018 and what kinds of texts are related to the critical literacy items in the test. Based on theory-oriented qualitative content analysis, critical literacy items in PISA…
Descriptors: International Assessment, Achievement Tests, Foreign Countries, Secondary School Students
Harrison, Scott; Kroehne, Ulf; Goldhammer, Frank; Lüdtke, Oliver; Robitzsch, Alexander – Large-scale Assessments in Education, 2023
Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone for interpreting PISA test scores. However, an…
Descriptors: Scoring, Test Items, Difficulty Level, Foreign Countries
Ahmet Yildirim; Nizamettin Koç – International Journal of Assessment Tools in Education, 2024
The present research aims to examine whether the questions in the Programme for International Student Assessment (PISA) 2009 reading literacy instrument display differential item functioning (DIF) among the Turkish, French, and American samples based on univariate and multivariate matching techniques before and after the total score, which is…
Descriptors: Test Items, Item Analysis, Correlation, Error of Measurement
Lundgren, Erik – Journal of Educational Data Mining, 2022
Response process data have the potential to provide a rich description of test-takers' thinking processes. However, retrieving insights from these data presents a challenge for educational assessments and educational data mining as they are complex and not well annotated. The present study addresses this challenge by developing a computational…
Descriptors: Problem Solving, Classification, Accuracy, Foreign Countries
Shear, Benjamin R. – Journal of Educational Measurement, 2023
Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…
Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests
Rujun Xu; James Soland – International Journal of Testing, 2024
International surveys are increasingly being used to understand nonacademic outcomes like math and science motivation, and to inform education policy changes within countries. Such instruments assume that the measure works consistently across countries, ethnicities, and languages--that is, they assume measurement invariance. While studies have…
Descriptors: Surveys, Statistical Bias, Achievement Tests, Foreign Countries
Fairness and Comparability in Achievement Motivation Items: A Differential Item Functioning Analysis
Bialo, Jacquelyn A.; Li, Hongli – Journal of Psychoeducational Assessment, 2022
Achievement motivation is a well-documented predictor of a variety of positive student outcomes. However, given observed group differences in motivation and related outcomes, motivation instruments should be checked for comparable item and scale functioning. Therefore, the purpose of this study was to evaluate measurement scale comparability and…
Descriptors: Student Motivation, Academic Achievement, Item Analysis, Gender Differences
A Sequential Bayesian Changepoint Detection Procedure for Aberrant Behaviors in Computerized Testing
Jing Lu; Chun Wang; Jiwei Zhang; Xue Wang – Grantee Submission, 2023
Changepoints are abrupt variations in a sequence of data in statistical inference. In educational and psychological assessments, it is pivotal to properly differentiate examinees' aberrant behaviors from solution behavior to ensure test reliability and validity. In this paper, we propose a sequential Bayesian changepoint detection algorithm to…
Descriptors: Bayesian Statistics, Behavior Patterns, Computer Assisted Testing, Accuracy
Uyar, Seyma – Eurasian Journal of Educational Research, 2020
Purpose: This study aimed to compare the performance of the latent class differential item functioning (DIF) approach with that of IRT-based DIF methods using manifest grouping. It also sought to draw attention to the value of conducting latent class DIF studies in Turkey. To this end, the study examined DIF in the PISA 2015 science data set. Research…
Descriptors: Item Response Theory, Foreign Countries, Cross Cultural Studies, Item Analysis
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Ayan, Cansu; Baris Pekmezci, Fulya – International Journal of Assessment Tools in Education, 2021
Testlets have advantages such as making it possible to measure higher-order thinking skills and saving time, which are accepted in the literature. For this reason, they have often been preferred in many implementations from in-class assessments to large-scale assessments. Because of increased usage of testlets, the following questions are…
Descriptors: Foreign Countries, International Assessment, Secondary School Students, Achievement Tests
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
Pools, Elodie; Monseur, Christian – Large-scale Assessments in Education, 2021
Background: The idea of using low-stakes assessment results is often mentioned when designing educational system reforms. However, when tests have no consequences for the students, test takers may not make enough effort when completing the test, and their lack of engagement may negatively affect the validity of the conclusions of the studies that…
Descriptors: Science Tests, Test Validity, Student Motivation, Learner Engagement