Showing 1 to 15 of 26 results
Peer reviewed
Silvia Testa; Renato Miceli – Educational Measurement: Issues and Practice, 2025
Random Equating (RE) and Heuristic Approach (HA) are two linking procedures that may be used to compare the scores of individuals in two tests that measure the same latent trait, in conditions where there are no common items or individuals. In this study, RE--that may only be used when the individuals taking the two tests come from the same…
Descriptors: Comparative Testing, Heuristics, Problem Solving, Personality Traits
Peer reviewed
Jung Yeon Park; Sean Joo; Zikun Li; Hyejin Yoon – Educational Measurement: Issues and Practice, 2025
This study examines potential assessment bias based on students' primary language status in PISA 2018. Specifically, multilingual (MLs) and nonmultilingual (non-MLs) students in the United States are compared with regard to their response time as well as scored responses across three cognitive domains (reading, mathematics, and science).…
Descriptors: Achievement Tests, Secondary School Students, International Assessment, Test Bias
Peer reviewed
Xiao, Yue; Veldkamp, Bernard; Liu, Hongyun – Educational Measurement: Issues and Practice, 2022
The action sequences of respondents in problem-solving tasks reflect rich and detailed information about their performance, including differences in problem-solving ability, even if item scores are equal. It is therefore not sufficient to infer individual problem-solving skills based solely on item scores. This study is a preliminary attempt to…
Descriptors: Problem Solving, Item Response Theory, Scores, Item Analysis
Peer reviewed
Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022
Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…
Descriptors: Ability, Tests, Equated Scores, Testing Problems
Peer reviewed
Skaggs, Gary; Hein, Serge F.; Wilkins, Jesse L. M. – Educational Measurement: Issues and Practice, 2020
In test-centered standard-setting methods, borderline performance can be represented by many different profiles of strengths and weaknesses. As a result, asking panelists to estimate item or test performance for a hypothetical group of borderline examinees, or a typical borderline examinee, may be an extremely difficult task and one that can…
Descriptors: Standard Setting (Scoring), Cutting Scores, Testing Problems, Profiles
Peer reviewed
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
Peer reviewed
Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Measurement: Issues and Practice, 2019
The current study investigated how item formats and their inherent affordances influence test-takers' cognition under uncertainty. Adult participants solved content-equivalent math items in multiple-selection multiple-choice and four alternative grid formats. The results indicated that participants' affirmative response tendency (i.e., judge the…
Descriptors: Affordances, Test Items, Test Format, Test Wiseness
Peer reviewed
Wind, Stefanie A.; Schumacker, Randall E. – Educational Measurement: Issues and Practice, 2017
The term measurement disturbance has been used to describe systematic conditions that affect a measurement process, resulting in a compromised interpretation of person or item estimates. Measurement disturbances have been discussed in relation to systematic response patterns associated with items and persons, such as start-up, plodding, boredom,…
Descriptors: Measurement, Testing Problems, Writing Tests, Performance Based Assessment
Peer reviewed
An, Chen; Braun, Henry; Walsh, Mary E. – Educational Measurement: Issues and Practice, 2018
Making causal inferences from a quasi-experiment is difficult. Sensitivity analysis approaches to address hidden selection bias thus have gained popularity. This study serves as an introduction to a simple but practical form of sensitivity analysis using Monte Carlo simulation procedures. We examine estimated treatment effects for a school-based…
Descriptors: Statistical Inference, Intervention, Program Effectiveness, Quasiexperimental Design
Peer reviewed
Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009
Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…
Descriptors: Test Bias, Test Items, Evaluation Methods, Scores
Peer reviewed
Ferrara, Steve; Svetina, Dubravka; Skucha, Sylvia; Davidson, Anne H. – Educational Measurement: Issues and Practice, 2011
Items on test score scales located at and below the Proficient cut score define the content area knowledge and skills required to achieve proficiency. Alternatively, examinees who perform at the Proficient level on a test can be expected to be able to demonstrate that they have mastered most of the knowledge and skills represented by the items at…
Descriptors: Knowledge Level, Mathematics Tests, Program Effectiveness, Inferences
Peer reviewed
Kingston, Neal; Nash, Brooke – Educational Measurement: Issues and Practice, 2011
An effect size of about 0.70 (or 0.40-0.70) is often claimed for the efficacy of formative assessment, but is not supported by the existing research base. More than 300 studies that appeared to address the efficacy of formative assessment in grades K-12 were reviewed. Many of the studies had severely flawed research designs yielding…
Descriptors: Elementary Secondary Education, Formative Evaluation, Program Effectiveness, Effect Size
Peer reviewed
Moran, Mary Ross; And Others – Educational Measurement: Issues and Practice, 1991
Practices identified by experts as critical variables in eliciting writing samples were checked against 12 randomly selected studies that used holistic ratings, to derive descriptions of inferential statistical results for the described samples. The studies often lacked precise information about these variables, limiting understanding of writing evaluation…
Descriptors: Cues, Educational Practices, Examiners, Holistic Evaluation
Peer reviewed
Phillips, S. E.; Clarizio, Harvey F. – Educational Measurement: Issues and Practice, 1988
Two major problems related to the identification of learning disabilities with individually administered achievement tests are discussed: (1) the appropriateness of standard versus developmental scores for determining the severity of discrepancy; and (2) the limitations of existing developmental score scales. Characteristics of the developmental…
Descriptors: Achievement Tests, Diagnostic Tests, Learning Disabilities, Scores
Peer reviewed
Bauer, Ernest A. – Educational Measurement: Issues and Practice, 1985
Misreadings of pencil answer marks on test answer sheets by optical scanners cause scoring errors. Twenty-six different pencils were tested for readability differences when optically scanned. Complete light and dark marks scanned perfectly for 18 pencils. Totals for all six mark types ranged from 947 to 1726 out of 1800. (BS)
Descriptors: Answer Sheets, Elementary Secondary Education, Error of Measurement, Optical Scanners