Publication Date
  In 2025: 0
  Since 2024: 3
  Since 2021 (last 5 years): 8
  Since 2016 (last 10 years): 31
  Since 2006 (last 20 years): 46
Source
  Applied Measurement in Education: 57
Publication Type
  Journal Articles: 57
  Reports - Research: 57
  Tests/Questionnaires: 4
  Information Analyses: 1
  Speeches/Meeting Papers: 1
Education Level
  Secondary Education: 16
  Elementary Education: 11
  Elementary Secondary Education: 11
  Higher Education: 7
  Postsecondary Education: 6
  Grade 3: 4
  Grade 8: 4
  Junior High Schools: 4
  Middle Schools: 4
  Grade 4: 3
  High Schools: 3
Location
  Canada: 14
  Netherlands: 6
  Australia: 5
  Germany: 4
  Israel: 4
  United States: 3
  Belgium: 2
  Finland: 2
  Iran: 2
  Iran (Tehran): 2
  Singapore: 2
Assessments and Surveys
  Program for International Student Assessment: 15
  Trends in International Mathematics and Science Study: 7
  National Assessment of…: 1
  Perceived Competence Scale…: 1
  Progress in International…: 1
  Test Anxiety Inventory: 1
Yi-Hsin Chen – Applied Measurement in Education, 2024
This study aims to apply the differential item functioning (DIF) technique with the deterministic inputs, noisy "and" gate (DINA) model to validate the mathematics construct and diagnostic attribute profiles across American and Singaporean students. Even with the same ability level, every single item is expected to show uniform DIF…
Descriptors: Foreign Countries, Achievement Tests, Elementary Secondary Education, International Assessment
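For reference, the DINA model named in this abstract has a simple closed form: an examinee answers item j correctly with probability 1 - s_j if they master every attribute the Q-matrix requires for that item, and with probability g_j otherwise. The sketch below only illustrates that item response function; the attribute pattern, Q-matrix row, and slip/guess values are made up, not taken from the study.

```python
import numpy as np

def dina_prob(alpha, q_row, slip, guess):
    """P(correct) under the DINA model for one examinee-item pair.

    alpha : 0/1 vector of attribute mastery for the examinee
    q_row : 0/1 vector of attributes required by the item (Q-matrix row)
    slip  : probability of answering wrong despite mastering all required attributes
    guess : probability of answering right without mastering them all
    """
    eta = int(np.all(alpha >= q_row))        # 1 iff every required attribute is mastered
    return (1.0 - slip) ** eta * guess ** (1 - eta)

# Illustrative values only: 3 attributes, one item requiring the first two.
alpha = np.array([1, 1, 0])
q_row = np.array([1, 1, 0])
print(dina_prob(alpha, q_row, slip=0.1, guess=0.2))   # approximately 0.9: a master answers correctly unless slipping
```

Uniform DIF in this framework would correspond to group-specific slip or guess parameters for the same item.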
Almehrizi, Rashid S. – Applied Measurement in Education, 2021
KR-21 reliability and its extension (coefficient α) give the reliability estimate of test scores under the assumption of tau-equivalent forms. KR-21 reliability gives the reliability estimate for summed scores on dichotomous items when items are randomly sampled from an infinite pool of similar items (randomly parallel forms). The article…
Descriptors: Test Reliability, Scores, Scoring, Computation
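For context, KR-21 needs only the item count and the mean and variance of the summed scores, because it assumes equally difficult items in addition to the sampling assumptions described above. A minimal sketch of the classical formula, with toy scores that are not from the article:

```python
import numpy as np

def kr21(scores, n_items):
    """KR-21 reliability from summed scores on dichotomous items.

    scores  : array of total (summed) scores, one per examinee
    n_items : number of items k on the test
    """
    scores = np.asarray(scores, dtype=float)
    k = n_items
    mean = scores.mean()
    var = scores.var(ddof=1)                 # sample variance of the total scores
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * var))

# Toy data: total scores of 6 examinees on a 10-item test.
print(round(kr21([4, 6, 7, 5, 9, 3], n_items=10), 3))
```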
Hamdollah Ravand; Farshad Effatpanah; Wenchao Ma; Jimmy de la Torre; Purya Baghaei; Olga Kunina-Habenicht – Applied Measurement in Education, 2024
The purpose of this study was to explore the nature of interactions among second/foreign language (L2) writing subskills. Two types of relationships were investigated: subskill-item and subskill-subskill relationships. To achieve the first purpose, using writing data obtained from the writing essays of 500 English as a foreign language (EFL)…
Descriptors: Second Language Learning, Writing Instruction, Writing Skills, Writing Tests
Pools, Elodie – Applied Measurement in Education, 2022
Many low-stakes assessments, such as international large-scale surveys, are administered during time-limited testing sessions and some test-takers are not able to endorse the last items of the test, resulting in not-reached (NR) items. However, because the test has no consequence for the respondents, these NR items can also stem from quitting the…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
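Not-reached items are the unbroken run of omitted responses at the end of a test, as opposed to items skipped and then followed by a later answer. The sketch below is my own illustration of how such a count can be read off a response matrix, not the classification procedure used in the study.

```python
import numpy as np

def count_not_reached(responses):
    """Count trailing missing responses (not-reached items) for each examinee.

    responses : 2-D array, examinees x items, with np.nan marking an omitted item.
    Only the unbroken run of missing values at the end of the test is counted;
    omissions followed by a later answer are skipped items, not not-reached items.
    """
    nr_counts = []
    for row in np.asarray(responses, dtype=float):
        n = 0
        for value in row[::-1]:              # walk backwards from the last item
            if np.isnan(value):
                n += 1
            else:
                break
        nr_counts.append(n)
    return np.array(nr_counts)

# Toy response matrix (1 = correct, 0 = incorrect, nan = no response); values are illustrative.
data = [[1, 0, 1, np.nan, np.nan],   # last two items not reached
        [1, np.nan, 0, 1, 0],        # one skipped item, nothing not reached
        [0, 1, 1, 1, 1]]
print(count_not_reached(data))        # [2 0 0]
```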
Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…
Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models
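The article fits explicit probabilistic models of knowledge, guessing, and blunder. As a simpler point of reference only (not the authors' model), the classical knowledge-or-random-guessing model with no blunders has a closed-form solution for the knowledge probability:

```python
def knowledge_from_proportion_correct(p_correct, n_options):
    """Estimate the 'knowledge' probability under the classical knowledge-or-guess model.

    Assumes an examinee either knows the answer (probability k) or guesses
    uniformly at random among the n_options alternatives, with no blunders:
        p_correct = k + (1 - k) / n_options
    Solving for k gives the familiar correction-for-guessing estimate.
    """
    chance = 1.0 / n_options
    return (p_correct - chance) / (1.0 - chance)

# Example: 70% correct on 4-option items implies roughly 60% genuine knowledge.
print(knowledge_from_proportion_correct(0.70, 4))   # approximately 0.6
```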
Mehrazmay, Roghayeh; Ghonsooly, Behzad; de la Torre, Jimmy – Applied Measurement in Education, 2021
The present study aims to examine gender differential item functioning (DIF) in the reading comprehension section of a high stakes test using cognitive diagnosis models. Based on the multiple-group generalized deterministic, noisy "and" gate (MG G-DINA) model, the Wald test and likelihood ratio test are used to detect DIF. The flagged…
Descriptors: Test Bias, College Entrance Examinations, Gender Differences, Reading Tests
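The study detects DIF with Wald and likelihood ratio tests under the MG G-DINA model. As a generic, model-free point of comparison rather than the authors' procedure, uniform DIF is often screened with the Mantel-Haenszel common odds ratio computed across total-score strata, sketched below with toy data.

```python
import numpy as np

def mantel_haenszel_odds_ratio(correct, group, total_score):
    """Mantel-Haenszel common odds ratio for one item, stratified by total score.

    correct     : 0/1 responses to the studied item
    group       : 0 for the reference group, 1 for the focal group
    total_score : matching variable (e.g., total test score)
    Returns alpha_MH; values far from 1 suggest uniform DIF.
    """
    correct = np.asarray(correct)
    group = np.asarray(group)
    total_score = np.asarray(total_score)
    num, den = 0.0, 0.0
    for s in np.unique(total_score):
        mask = total_score == s
        a = np.sum((group[mask] == 0) & (correct[mask] == 1))  # reference, right
        b = np.sum((group[mask] == 0) & (correct[mask] == 0))  # reference, wrong
        c = np.sum((group[mask] == 1) & (correct[mask] == 1))  # focal, right
        d = np.sum((group[mask] == 1) & (correct[mask] == 0))  # focal, wrong
        t = a + b + c + d
        if t > 0:
            num += a * d / t
            den += b * c / t
    return num / den if den > 0 else np.nan

# Toy data only: 8 examinees matched on a coarse total score.
item = [1, 1, 0, 1, 0, 1, 1, 0]
grp  = [0, 0, 0, 0, 1, 1, 1, 1]
tot  = [5, 5, 3, 3, 5, 5, 3, 3]
print(mantel_haenszel_odds_ratio(item, grp, tot))   # approximately 3: higher odds of success for the reference group
```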
Rios, Joseph A.; Guo, Hongwen – Applied Measurement in Education, 2020
The objective of this study was to evaluate whether differential noneffortful responding (identified via response latencies) was present in four countries administered a low-stakes college-level critical thinking assessment. Results indicated significant differences (as large as 0.90 "SD") between nearly all country pairings in the…
Descriptors: Response Style (Tests), Cultural Differences, Critical Thinking, Cognitive Tests
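Noneffortful (rapid-guessing) responses are typically flagged when a response latency falls below a per-item time threshold. The sketch below uses the common heuristic of 10% of an item's mean response time with a fixed floor; these specific cutoffs are assumptions for illustration, not necessarily the rule used in this study.

```python
import numpy as np

def flag_noneffortful(response_times, threshold_fraction=0.10, floor_seconds=2.0):
    """Flag likely noneffortful responses from a matrix of response latencies.

    response_times     : 2-D array, examinees x items, in seconds
    threshold_fraction : per-item threshold as a fraction of that item's mean time
                         (a common heuristic; an assumption here, not the study's rule)
    floor_seconds      : never let the threshold fall below this many seconds
    Returns a boolean matrix where True marks a response flagged as noneffortful.
    """
    rt = np.asarray(response_times, dtype=float)
    thresholds = np.maximum(threshold_fraction * rt.mean(axis=0), floor_seconds)
    return rt < thresholds             # broadcast per-item thresholds across examinees

# Toy latencies in seconds (illustrative only).
rt = np.array([[35.0, 48.0,  1.2],
               [ 2.1, 50.0, 40.0],
               [30.0, 46.0, 44.0]])
flags = flag_noneffortful(rt)
print(flags)
print("noneffortful rate per examinee:", flags.mean(axis=1))
```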
Takahiro Terao – Applied Measurement in Education, 2024
This study aimed to compare item characteristics and response time between stimulus conditions in computer-delivered listening tests. Listening materials had three variants: regular videos, frame-by-frame videos, and only audios without visuals. Participants were 228 Japanese high school students who were requested to complete one of nine…
Descriptors: Computer Assisted Testing, Audiovisual Aids, Reaction Time, High School Students
El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020
In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…
Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Sachse, Karoline A.; Haag, Nicole – Applied Measurement in Education, 2017
Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…
Descriptors: Error of Measurement, Test Bias, International Assessment, Computation
von Aufschnaiter, Claudia; Alonzo, Alicia C. – Applied Measurement in Education, 2018
Establishing nuanced interpretations of student thinking is central to formative assessment but difficult, especially for preservice teachers. Learning progressions (LPs) have been proposed as a framework for promoting interpretations of students' thinking; however, research is needed to investigate whether and how an LP can be used to support…
Descriptors: Formative Evaluation, Preservice Teachers, Physics, Science Instruction
Papenberg, Martin; Musch, Jochen – Applied Measurement in Education, 2017
In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…
Descriptors: Multiple Choice Tests, Test Items, Test Validity, Test Reliability
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
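As a generic illustration of the workflow described above (not the AES system evaluated in the article), the sketch below scores essays with bag-of-words features and ridge regression, producing out-of-fold predictions so that no essay is scored by a model trained on it.

```python
# Minimal, generic AES sketch: TF-IDF features + ridge regression, scored out-of-fold.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline

# Toy essays and toy human ratings (e.g., the mean rating across a rater group); not study data.
essays = ["First toy essay text ...", "Second toy essay text ...",
          "Third toy essay text ...", "Fourth toy essay text ...",
          "Fifth toy essay text ...", "Sixth toy essay text ..."]
human_scores = np.array([2.0, 3.5, 4.0, 1.5, 3.0, 5.0])

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))

# Each essay is scored by a model that never saw it during training (3-fold cross-validation),
# mirroring the cross-validation scheme mentioned in the abstract.
machine_scores = cross_val_predict(model, essays, human_scores, cv=3)
print(machine_scores.round(2))   # in practice, compare these to the human criterion (correlation, kappa, etc.)
```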
Oliveri, Maria Elena; Ercikan, Kadriye; Lyons-Thomas, Juliette; Holtzman, Steven – Applied Measurement in Education, 2016
Differential item functioning (DIF) analyses have been used as the primary method in large-scale assessments to examine fairness for subgroups. Currently, DIF analyses are conducted utilizing manifest methods using observed characteristics (gender and race/ethnicity) for grouping examinees. Homogeneity of item responses is assumed denoting that…
Descriptors: Test Bias, Language Minorities, Effect Size, Foreign Countries