ERIC - Search Results

Publication Date

In 2025	1
Since 2024	4
Since 2021 (last 5 years)	6
Since 2016 (last 10 years)	10
Since 2006 (last 20 years)	22

Descriptor

Item Analysis	26
Test Items	19
Foreign Countries	11
Item Response Theory	7
Achievement Tests	5
Comparative Analysis	5
Evaluation Methods	5
International Assessment	5
Models	5
Psychometrics	5
Scoring	5
Correlation	4
Error of Measurement	4
Factor Analysis	4
Computer Software	3
Measurement	3
Measurement Techniques	3
Psychological Testing	3
Secondary School Students	3
Statistical Analysis	3
Test Bias	3
Test Construction	3
Test Reviews	3
Test Validity	3
Testing	3

Source

International Journal of…

Publication Type

Journal Articles	26
Reports - Research	15
Reports - Descriptive	5
Reports - Evaluative	5

Education Level

Higher Education	5
Elementary Secondary Education	4
Secondary Education	4
Elementary Education	2
Grade 4	2
High Schools	2
Intermediate Grades	2
Postsecondary Education	2
Adult Education	1
Preschool Education	1

Location

Argentina	1
Brazil	1
Canada	1
Cyprus	1
Japan	1
Mongolia	1
Spain	1
Sweden	1
Turkey	1

Assessments and Surveys

Program for International…	3
Progress in International…	1
SAT (College Admission Test)	1
Trends in International…	1

Showing 1 to 15 of 26 results

IRT Linking Methods for the Bifactor Model with Mixed Format Tests

Peer reviewed

Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025

This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…

Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis

Detecting Differential Item Functioning with Multiple Causes: A Comparison of Three Methods

Peer reviewed

Xiaowen Liu – International Journal of Testing, 2024

Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using the three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…

Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation

Social Desirability, Social-Emotional Competencies and Intelligence: Using Quadruplets to Estimate Evaluative and Descriptive Content

Peer reviewed

Tatiana di Lucia Faion Franchi; Felipe Valentini; Leonardo Botinhon de Campos; Letícia da Silva de Souza; Pedro Vanni; Leonardo de Barros Mose; Ricardo Primi – International Journal of Testing, 2024

This study investigated the application of item quadruplets to control social desirability in measuring socio-emotional competencies and their relationship with intelligence. Quadruplets involve four variations of an item, differing in the polarity of their descriptive content and evaluative content (social desirability): positive-desirable,…

Descriptors: Social Desirability, Social Emotional Learning, Interpersonal Competence, Emotional Intelligence

The Analysis of TIMSS 2015 Data with Confirmatory Mixture Item Response Theory: A Multidimensional Approach

Peer reviewed

Saatcioglu, Fatima Munevver; Sen, Sedat – International Journal of Testing, 2023

In this study, we illustrated an application of the confirmatory mixture IRT model for multidimensional tests. We aimed to examine the differences in student performance by domains with a confirmatory mixture IRT modeling approach. A three-dimensional and three-class model was analyzed by assuming content domains as dimensions and cognitive…

Descriptors: Item Response Theory, Foreign Countries, Elementary Secondary Education, Achievement Tests

Evaluating Group Differences in Online Reading Comprehension: The Impact of Item Properties

Peer reviewed

Bulut, Hatice Cigdem; Bulut, Okan; Arikan, Serkan – International Journal of Testing, 2023

This study examined group differences in online reading comprehension (ORC) using student data from the 2016 administration of the Progress in International Reading Literacy Study (ePIRLS). An explanatory item response modeling approach was used to explore the effects of item properties (i.e., item format, text complexity, and cognitive…

Descriptors: International Assessment, Achievement Tests, Grade 4, Foreign Countries

Beyond Group Comparisons: Accounting for Intersectional Sources of Bias in International Survey Measures

Peer reviewed

Rujun Xu; James Soland – International Journal of Testing, 2024

International surveys are increasingly being used to understand nonacademic outcomes like math and science motivation, and to inform education policy changes within countries. Such instruments assume that the measure works consistently across countries, ethnicities, and languages--that is, they assume measurement invariance. While studies have…

Descriptors: Surveys, Statistical Bias, Achievement Tests, Foreign Countries

The Relationship between Response-Time Effort and Accuracy in PISA Science Multiple Choice Items

Peer reviewed

Michaelides, Michalis P.; Ivanova, Militsa; Nicolaou, Christiana – International Journal of Testing, 2020

The study examined the relationship between examinees' test-taking effort and their accuracy rate on items from the PISA 2015 assessment. The 10% normative threshold method was applied on Science multiple-choice items in the Cyprus sample to detect rapid guessing behavior. Results showed that the extent of rapid guessing across simple and complex…

Descriptors: Accuracy, Multiple Choice Tests, International Assessment, Achievement Tests

Examining the Impact of Covariates on Anchor Tests to Ascertain Quality over Time in a College Admissions Test

Peer reviewed

Wiberg, Marie; von Davier, Alina A. – International Journal of Testing, 2017

We propose a comprehensive procedure for the implementation of a quality control process of anchor tests for a college admissions test with multiple consecutive administrations. We propose to examine the anchor tests and their items in connection with covariates to investigate if there was any unusual behavior in the anchor test results over time…

Descriptors: College Entrance Examinations, Test Items, Equated Scores, Quality Control

On Designing Construct Driven Situational Judgment Tests: Some Preliminary Recommendations

Peer reviewed

Guenole, Nigel; Chernyshenko, Oleksandr S.; Weekly, Jeff – International Journal of Testing, 2017

Situational judgment tests (SJTs) are widely agreed to be a measurement technique. It is also widely agreed that SJTs are a questionable methodological choice for measurement of psychological constructs, such as behavioral competencies, due to a lack of evidence supporting appropriate factor structures and high internal consistencies.…

Descriptors: Situational Tests, Psychological Evaluation, Test Construction, Industrial Psychology

International Semiotics: Item Difficulty and the Complexity of Science Item Illustrations in the PISA-2009 International Test Comparison

Peer reviewed

Solano-Flores, Guillermo; Wang, Chao; Shade, Chelsey – International Journal of Testing, 2016

We examined multimodality (the representation of information in multiple semiotic modes) in the context of international test comparisons. Using Program for International Student Assessment (PISA)-2009 data, we examined the correlation of the difficulty of science items and the complexity of their illustrations. We observed statistically…

Descriptors: Semiotics, Difficulty Level, Test Items, Science Tests

An Anthropologist among the Psychometricians: Assessment Events, Ethnography, and Differential Item Functioning in the Mongolian Gobi

Peer reviewed

Maddox, Bryan; Zumbo, Bruno D.; Tay-Lim, Brenda; Qu, Demin – International Journal of Testing, 2015

This article explores the potential for ethnographic observations to inform the analysis of test item performance. In 2010, a standardized, large-scale adult literacy assessment took place in Mongolia as part of the United Nations Educational, Scientific and Cultural Organization Literacy Assessment and Monitoring Programme (LAMP). In a novel form…

Descriptors: Anthropology, Psychometrics, Ethnography, Adult Literacy

Exploring Differential Subgroup Functioning on SAT Writing Items: What Happens When English Is Not a Test Taker's Best Language?

Peer reviewed

Engelhard, George, Jr.; Kobrin, Jennifer L.; Wind, Stefanie A. – International Journal of Testing, 2014

The purpose of this study is to explore patterns in model-data fit related to subgroups of test takers from a large-scale writing assessment. Using data from the SAT, a calibration group was randomly selected to represent test takers who reported that English was their best language from the total population of test takers (N = 322,011). A…

Descriptors: College Entrance Examinations, Writing Tests, Goodness of Fit, English

Test Reviewing at the Buros Center for Testing

Peer reviewed

Carlson, Janet F.; Geisinger, Kurt F. – International Journal of Testing, 2012

The test review process used by the Buros Center for Testing is described as a series of 11 steps: (1) identifying tests to be reviewed, (2) obtaining tests and preparing test descriptions, (3) determining whether tests meet review criteria, (4) identifying appropriate reviewers, (5) selecting reviewers, (6) sending instructions and materials to…

Descriptors: Testing, Test Reviews, Evaluation Methods, Evaluation Criteria

Computerized Adaptive Testing with the Zinnes and Griggs Pairwise Preference Ideal Point Model

Peer reviewed

Stark, Stephen; Chernyshenko, Oleksandr S. – International Journal of Testing, 2011

This article delves into a relatively unexplored area of measurement by focusing on adaptive testing with unidimensional pairwise preference items. The use of such tests is becoming more common in applied non-cognitive assessment because research suggests that this format may help to reduce certain types of rater error and response sets commonly…

Descriptors: Test Length, Simulation, Adaptive Testing, Item Analysis

Test Reviewing in Spain

Peer reviewed

Muniz, Jose; Fernandez-Hermida, Jose R.; Fonseca-Pedrero, Eduardo; Campillo-Alvarez, Angela; Pena-Suarez, Elsa – International Journal of Testing, 2012

The proper use of psychological tests requires that the measurement instruments have adequate psychometric properties, such as reliability and validity, and that the professionals who use the instruments have the necessary expertise. In this article, we present the first review of tests published in Spain, carried out with an assessment model…

Descriptors: Student Evaluation, Measurement, Foreign Countries, Psychometrics

Author

Chernyshenko, Oleksandr S.	2
Elosua, Paula	2
Arikan, Serkan	1
Bulut, Hatice Cigdem	1
Bulut, Okan	1
Campillo-Alvarez, Angela	1
Carlson, Janet F.	1
Cassady, Jerrell C.	1
Cheong, Yuk Fai	1
Childs, Ruth A.	1
Dirkzwager, Arie	1
Engelhard, George, Jr.	1
Felipe Valentini	1
Fernandez-Hermida, Jose R.	1
Finger, Michael S.	1
Fonseca-Pedrero, Eduardo	1
Furlan, Luis Alberto	1
Geisinger, Kurt F.	1
Guenole, Nigel	1
He, Wei	1
Iliescu, Dragos	1
Ivanova, Militsa	1
Jaciw, Andrew P.	1
James Soland	1
Ki Lynn Cole	1