Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation
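For readers unfamiliar with the Mantel-Haenszel method named in the abstract above, the following is a minimal sketch of the textbook common odds ratio for a single dichotomous item. It is not taken from the article. The function names, the `"ref"`/`"focal"` group labels, and the record layout are hypothetical, and the matching variable is assumed to be a total test score.

```python
import math
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Return the Mantel-Haenszel common odds ratio for one item.

    records: iterable of (group, matching_score, correct) tuples, where
    group is "ref" or "focal" (hypothetical labels) and correct is 0 or 1.
    """
    # Per score stratum, count: A = ref correct, B = ref incorrect,
    # C = focal correct, D = focal incorrect.
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        cell = strata[score]
        if group == "ref":
            cell[0 if correct else 1] += 1
        elif group == "focal":
            cell[2 if correct else 3] += 1
        else:
            raise ValueError(f"unknown group: {group!r}")

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n

    if denominator == 0:
        raise ValueError("odds ratio undefined: no stratum has B*C > 0")
    return numerator / denominator


def ets_delta(odds_ratio):
    """Convert the odds ratio to the ETS delta scale (-2.35 * ln(alpha))."""
    return -2.35 * math.log(odds_ratio)
```

An odds ratio near 1 (delta near 0) suggests little DIF under this statistic. Values above 1 indicate that the item favors the reference group after matching on the score.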
Tatiana di Lucia Faion Franchi; Felipe Valentini; Leonardo Botinhon de Campos; Letícia da Silva de Souza; Pedro Vanni; Leonardo de Barros Mose; Ricardo Primi – International Journal of Testing, 2024
This study investigated the application of item quadruplets to control social desirability in measuring socio-emotional competencies and their relationship with intelligence. Quadruplets involve four variations of an item, differing in the polarity of their descriptive content and evaluative content (social desirability): positive-desirable,…
Descriptors: Social Desirability, Social Emotional Learning, Interpersonal Competence, Emotional Intelligence
Saatcioglu, Fatima Munevver; Sen, Sedat – International Journal of Testing, 2023
In this study, we illustrated an application of the confirmatory mixture IRT model for multidimensional tests. We aimed to examine the differences in student performance by domains with a confirmatory mixture IRT modeling approach. A three-dimensional and three-class model was analyzed by assuming content domains as dimensions and cognitive…
Descriptors: Item Response Theory, Foreign Countries, Elementary Secondary Education, Achievement Tests
Bulut, Hatice Cigdem; Bulut, Okan; Arikan, Serkan – International Journal of Testing, 2023
This study examined group differences in online reading comprehension (ORC) using student data from the 2016 administration of the Progress in International Reading Literacy Study (ePIRLS). An explanatory item response modeling approach was used to explore the effects of item properties (i.e., item format, text complexity, and cognitive…
Descriptors: International Assessment, Achievement Tests, Grade 4, Foreign Countries
Rujun Xu; James Soland – International Journal of Testing, 2024
International surveys are increasingly being used to understand nonacademic outcomes like math and science motivation, and to inform education policy changes within countries. Such instruments assume that the measure works consistently across countries, ethnicities, and languages--that is, they assume measurement invariance. While studies have…
Descriptors: Surveys, Statistical Bias, Achievement Tests, Foreign Countries
Michaelides, Michalis P.; Ivanova, Militsa; Nicolaou, Christiana – International Journal of Testing, 2020
The study examined the relationship between examinees' test-taking effort and their accuracy rate on items from the PISA 2015 assessment. The 10% normative threshold method was applied to science multiple-choice items in the Cyprus sample to detect rapid guessing behavior. Results showed that the extent of rapid guessing across simple and complex…
Descriptors: Accuracy, Multiple Choice Tests, International Assessment, Achievement Tests
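As an illustration of the 10% normative threshold idea mentioned in the abstract above, here is a minimal sketch that flags a response as a rapid guess when its response time falls below 10% of the item's mean response time. It is not the authors' code. The function names and data layout are hypothetical, and the 10-second cap on the threshold is an assumption drawn from common descriptions of the method, not from this abstract.

```python
def normative_thresholds(response_times, fraction=0.10, cap_seconds=10.0):
    """Compute one rapid-guessing threshold per item.

    response_times: dict mapping item id -> list of response times in seconds.
    The threshold is `fraction` of the item's mean response time, limited to
    `cap_seconds` (the cap is an assumption, not stated in the abstract).
    """
    thresholds = {}
    for item, times in response_times.items():
        if not times:
            raise ValueError(f"no response times for item {item!r}")
        mean_time = sum(times) / len(times)
        thresholds[item] = min(fraction * mean_time, cap_seconds)
    return thresholds


def flag_rapid_guesses(response_times, thresholds):
    """Return, per item, a list of booleans: True marks a rapid guess."""
    return {
        item: [t < thresholds[item] for t in times]
        for item, times in response_times.items()
    }


def rapid_guess_rate(flags):
    """Proportion of all responses flagged as rapid guesses."""
    all_flags = [f for item_flags in flags.values() for f in item_flags]
    return sum(all_flags) / len(all_flags)
```

For example, an item with a mean response time of 30 seconds gets a 3-second threshold, so a 1-second response would be flagged while a 20-second response would not.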
Wiberg, Marie; von Davier, Alina A. – International Journal of Testing, 2017
We propose a comprehensive procedure for the implementation of a quality control process of anchor tests for a college admissions test with multiple consecutive administrations. We propose to examine the anchor tests and their items in connection with covariates to investigate if there was any unusual behavior in the anchor test results over time…
Descriptors: College Entrance Examinations, Test Items, Equated Scores, Quality Control
Guenole, Nigel; Chernyshenko, Oleksandr S.; Weekly, Jeff – International Journal of Testing, 2017
Situational judgment tests (SJTs) are widely agreed to be a measurement technique. It is also widely agreed that SJTs are a questionable methodological choice for measurement of psychological constructs, such as behavioral competencies, due to a lack of evidence supporting appropriate factor structures and high internal consistencies.…
Descriptors: Situational Tests, Psychological Evaluation, Test Construction, Industrial Psychology
Solano-Flores, Guillermo; Wang, Chao; Shade, Chelsey – International Journal of Testing, 2016
We examined multimodality (the representation of information in multiple semiotic modes) in the context of international test comparisons. Using Programme for International Student Assessment (PISA) 2009 data, we examined the correlation of the difficulty of science items and the complexity of their illustrations. We observed statistically…
Descriptors: Semiotics, Difficulty Level, Test Items, Science Tests
Maddox, Bryan; Zumbo, Bruno D.; Tay-Lim, Brenda; Qu, Demin – International Journal of Testing, 2015
This article explores the potential for ethnographic observations to inform the analysis of test item performance. In 2010, a standardized, large-scale adult literacy assessment took place in Mongolia as part of the United Nations Educational, Scientific and Cultural Organization Literacy Assessment and Monitoring Programme (LAMP). In a novel form…
Descriptors: Anthropology, Psychometrics, Ethnography, Adult Literacy
Engelhard, George, Jr.; Kobrin, Jennifer L.; Wind, Stefanie A. – International Journal of Testing, 2014
The purpose of this study is to explore patterns in model-data fit related to subgroups of test takers from a large-scale writing assessment. Using data from the SAT, a calibration group was randomly selected to represent test takers who reported that English was their best language from the total population of test takers (N = 322,011). A…
Descriptors: College Entrance Examinations, Writing Tests, Goodness of Fit, English
Carlson, Janet F.; Geisinger, Kurt F. – International Journal of Testing, 2012
The test review process used by the Buros Center for Testing is described as a series of 11 steps: (1) identifying tests to be reviewed, (2) obtaining tests and preparing test descriptions, (3) determining whether tests meet review criteria, (4) identifying appropriate reviewers, (5) selecting reviewers, (6) sending instructions and materials to…
Descriptors: Testing, Test Reviews, Evaluation Methods, Evaluation Criteria
Stark, Stephen; Chernyshenko, Oleksandr S. – International Journal of Testing, 2011
This article delves into a relatively unexplored area of measurement by focusing on adaptive testing with unidimensional pairwise preference items. The use of such tests is becoming more common in applied non-cognitive assessment because research suggests that this format may help to reduce certain types of rater error and response sets commonly…
Descriptors: Test Length, Simulation, Adaptive Testing, Item Analysis
Muniz, Jose; Fernandez-Hermida, Jose R.; Fonseca-Pedrero, Eduardo; Campillo-Alvarez, Angela; Pena-Suarez, Elsa – International Journal of Testing, 2012
The proper use of psychological tests requires that the measurement instruments have adequate psychometric properties, such as reliability and validity, and that the professionals who use the instruments have the necessary expertise. In this article, we present the first review of tests published in Spain, carried out with an assessment model…
Descriptors: Student Evaluation, Measurement, Foreign Countries, Psychometrics