ERIC - Search Results

Showing all 14 results

Combining Nonparametric and Parametric Item Response Theory to Explore Data Quality: Illustrations and a Simulation Study

Peer reviewed

Wind, Stefanie A.; Lugu, Benjamin – Applied Measurement in Education, 2024

Researchers who use measurement models for evaluation purposes often select models with stringent requirements, such as Rasch models, which are parametric. Mokken Scale Analysis (MSA) offers a theory-driven nonparametric modeling approach that may be more appropriate for some measurement applications. Researchers have discussed using MSA as a…

Descriptors: Item Response Theory, Data Analysis, Simulation, Nonparametric Statistics

New Tests of Rater Drift in Trend Scoring

Peer reviewed

Donoghue, John R.; Eckerly, Carol – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

Prediction of Essay Scores from Writing Process and Product Features Using Data Mining Methods

Peer reviewed

Sinharay, Sandip; Zhang, Mo; Deane, Paul – Applied Measurement in Education, 2019

Analysis of keystroke logging data is of increasing interest, as evident from a substantial amount of recent research on the topic. Some of the research on keystroke logging data has focused on the prediction of essay scores from keystroke logging features, but linear regression is the only prediction method that has been used in this research.…

Descriptors: Scores, Prediction, Writing Processes, Data Analysis

Using the Bayes Factors to Evaluate Person Fit in the Item Response Theory

Peer reviewed

Pan, Tianshu; Yin, Yue – Applied Measurement in Education, 2017

In this article, we propose using the Bayes factors (BF) to evaluate person fit in item response theory models under the framework of Bayesian evaluation of an informative diagnostic hypothesis. We first discuss the theoretical foundation for this application and how to analyze person fit using BF. To demonstrate the feasibility of this approach,…

Descriptors: Bayesian Statistics, Goodness of Fit, Item Response Theory, Monte Carlo Methods

Bi-Factor MIRT Observed-Score Equating for Mixed-Format Tests

Peer reviewed

Lee, Guemin; Lee, Won-Chan – Applied Measurement in Education, 2016

The main purposes of this study were to develop bi-factor multidimensional item response theory (BF-MIRT) observed-score equating procedures for mixed-format tests and to investigate relative appropriateness of the proposed procedures. Using data from a large-scale testing program, three types of pseudo data sets were formulated: matched samples,…

Descriptors: Test Format, Multidimensional Scaling, Item Response Theory, Equated Scores

The Use of Multiple Imputation for Missing Data in Uniform DIF Analysis: Power and Type I Error Rates

Peer reviewed

Finch, Holmes – Applied Measurement in Education, 2011

Methods of uniform differential item functioning (DIF) detection have been extensively studied in the complete data case. However, less work has been done examining the performance of these methods when missing item responses are present. Research that has been done in this regard appears to indicate that treating missing item responses as…

Descriptors: Test Bias, Data Analysis, Error of Measurement

In Search of Validity Evidence in Support of the Interpretation and Use of Assessments of Complex Constructs: Discussion of Research on Assessing 21st Century Skills

Peer reviewed

Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016

Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…

Descriptors: Evaluation Methods, Test Construction, Design, Scaling

Practical Application of a Synthetic Linking Function on Small-Sample Equating

Peer reviewed

Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011

The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…

Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis

A Bootstrap Generalization of Modified Parallel Analysis for IRT Dimensionality Assessment

Peer reviewed

Finch, Holmes; Monahan, Patrick – Applied Measurement in Education, 2008

This article introduces a bootstrap generalization to the Modified Parallel Analysis (MPA) method of test dimensionality assessment using factor analysis. This methodology, based on the use of Marginal Maximum Likelihood nonlinear factor analysis, provides for the calculation of a test statistic based on a parametric bootstrap using the MPA…

Descriptors: Monte Carlo Methods, Factor Analysis, Generalization, Methods

Item Position and Item Difficulty Change in an IRT-Based Common Item Equating Design

Peer reviewed

Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009

In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…

Descriptors: Test Items, Test Content, Testing Programs, Simulation

How to Assign Individualized Scores on a Group Project: An Empirical Evaluation

Peer reviewed

Zhang, Bo; Ohland, Matthew W. – Applied Measurement in Education, 2009

One major challenge in using group projects to assess student learning is accounting for the differences of contribution among group members so that the mark assigned to each individual actually reflects their performance. This research addresses the validity of grading group projects by evaluating different methods that derive individualized…

Descriptors: Monte Carlo Methods, Validity, Student Evaluation, Evaluation Methods

Identifying Possible Sources of Differential Functioning Using Differential Bundle Functioning with Polytomously Scored Data

Peer reviewed

McCarty, F. A.; Oshima, T. C.; Raju, Nambury S. – Applied Measurement in Education, 2007

Oshima, Raju, Flowers, and Slinde (1998) described procedures for identifying sources of differential functioning for dichotomous data using differential bundle functioning (DBF) derived from the differential functioning of items and test (DFIT) framework (Raju, van der Linden, & Fleer, 1995). The purpose of this study was to extend the…

Descriptors: Rating Scales, Test Bias, Scoring, Test Items

Comparing DIF across Math and Reading/Language Arts Tests for Students Receiving a Read-Aloud Accommodation

Peer reviewed

Bolt, Sara E.; Ysseldyke, James E. – Applied Measurement in Education, 2006

Although testing accommodations are commonly provided to students with disabilities within large-scale testing programs, research findings on how well accommodations allow for comparable measurement of student knowledge and skill remain inconclusive. The purpose of this study was to examine the extent to which 1 commonly held belief about testing…

Descriptors: Oral Reading, Testing Accommodations, Disabilities, Special Needs Students

Accommodations for Students With Limited English Proficiency in the National Assessment of Educational Progress

Peer reviewed

Abedi, Jamal; Hejri, Fereshteh – Applied Measurement in Education, 2004

This study examines the effect and validity of accommodations for limited English proficiency (LEP) students in the National Assessment of Educational Progress (NAEP) and the impact of language factors on the assessment and accommodation of these students. Results indicate that accommodations used in NAEP did not reduce the performance gap between…

Descriptors: Test Items, Data Analysis, Validity, National Competency Tests
