Showing 1 to 15 of 18 results
Peer reviewed
Chalmers, R. Philip – Journal of Educational Measurement, 2023
Several marginal effect size (ES) statistics suitable for quantifying the magnitude of differential item functioning (DIF) have been proposed in the area of item response theory; for instance, the Differential Functioning of Items and Tests (DFIT) statistics, signed and unsigned item difference in the sample statistics (SIDS, UIDS, NSIDS, and…
Descriptors: Test Bias, Item Response Theory, Definitions, Monte Carlo Methods
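As a point of reference for these acronyms, a generic formulation of the signed and unsigned item difference in the sample statistics (a sketch of the common definitions, not necessarily Chalmers's exact notation): with $S_i(\theta; \psi)$ the expected score on item $i$ under item parameters $\psi$, and $\hat\psi_R$, $\hat\psi_F$ the reference- and focal-group estimates,

$$\mathrm{SIDS}_i = \frac{1}{N_F}\sum_{j \in F}\Big[S_i(\hat\theta_j; \hat\psi_F) - S_i(\hat\theta_j; \hat\psi_R)\Big], \qquad \mathrm{UIDS}_i = \frac{1}{N_F}\sum_{j \in F}\Big|S_i(\hat\theta_j; \hat\psi_F) - S_i(\hat\theta_j; \hat\psi_R)\Big|.$$

The signed version can average out DIF that changes direction across the ability range, whereas the unsigned version cannot, which is why both are typically reported.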
Peer reviewed
Haberman, Shelby J. – Journal of Educational Measurement, 2020
Examples of the impact of statistical theory on assessment practice are provided from the perspective of a statistician trained in theoretical statistics who began to work on assessments. Goodness of fit of item-response models is examined in terms of restricted likelihood-ratio tests and generalized residuals. Minimum discriminant information…
Descriptors: Statistics, Goodness of Fit, Item Response Theory, Statistical Analysis
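The generalized-residual idea mentioned here can be summarized in one line (a generic sketch, not Haberman's full development): for any statistic $d$ of the response data, compare its observed value with its expectation under the fitted model, standardized by an estimate of the sampling variance of the difference,

$$t = \frac{d_{\mathrm{obs}} - \hat{E}(d)}{\sqrt{\widehat{\mathrm{Var}}\big(d_{\mathrm{obs}} - \hat{E}(d)\big)}},$$

so that under a correctly specified model $t$ is approximately standard normal and a large $|t|$ flags misfit with respect to that statistic.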
Peer reviewed
Debeer, Dries; Janssen, Rianne; De Boeck, Paul – Journal of Educational Measurement, 2017
When dealing with missing responses, two types of omissions can be discerned: items can be skipped or not reached by the test taker. When the occurrence of these omissions is related to the proficiency process, the missingness is nonignorable. The purpose of this article is to present a tree-based IRT framework for modeling responses and omissions…
Descriptors: Item Response Theory, Test Items, Responses, Testing Problems
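A minimal sketch of the tree-based idea, assuming a two-node tree (a simplification, not the authors' full framework): the first node models whether item $i$ is answered at all, the second models correctness given a response, each with its own person parameter,

$$P(\text{answered}_{ij}) = \frac{\exp(\theta^{(m)}_j - \beta^{(m)}_i)}{1+\exp(\theta^{(m)}_j - \beta^{(m)}_i)}, \qquad P(X_{ij}=1 \mid \text{answered}) = \frac{\exp(\theta^{(a)}_j - \beta^{(a)}_i)}{1+\exp(\theta^{(a)}_j - \beta^{(a)}_i)},$$

where letting the omission-propensity dimension $\theta^{(m)}$ correlate with the ability dimension $\theta^{(a)}$ is what accommodates nonignorable missingness.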
Peer reviewed
Castellano, Katherine E.; McCaffrey, Daniel F. – Journal of Educational Measurement, 2020
The residual gain score has been of historical interest, and its percentile rank has been of interest more recently given its close correspondence to the popular Student Growth Percentile. However, these estimators suffer from low accuracy and systematic bias (bias conditional on prior latent achievement). This article explores three…
Descriptors: Accuracy, Student Evaluation, Measurement Techniques, Evaluation Methods
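For context, the residual gain score in its usual form (the common formulation, not necessarily the article's exact estimator): regress the current score $Y$ on the prior score $X$ and keep the residual,

$$g_j = Y_j - \big(\hat\beta_0 + \hat\beta_1 X_j\big),$$

and the percentile-rank version locates $g_j$ in the distribution of residuals, which is what ties it to the Student Growth Percentile.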
Peer reviewed
Guo, Hongwen; Robin, Frederic; Dorans, Neil – Journal of Educational Measurement, 2017
The early detection of item drift is an important issue for frequently administered testing programs because items are reused over time. Unfortunately, operational data tend to be very sparse and do not lend themselves to frequent monitoring analyses, particularly for on-demand testing. Building on existing residual analyses, the authors propose…
Descriptors: Testing, Test Items, Identification, Sample Size
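A rough sketch of a residual-based drift check of the kind this work builds on (illustrative Python; the function and threshold are hypothetical, not the authors' statistic):

```python
import numpy as np

def item_residual_z(responses, p_model):
    """Standardized item-level residual for one administration window.
    responses: 0/1 vector for one item across examinees.
    p_model: model-implied probabilities P_i(theta_hat_j), same examinees."""
    responses = np.asarray(responses, dtype=float)
    p_model = np.asarray(p_model, dtype=float)
    raw = np.sum(responses - p_model)            # aggregate residual
    var = np.sum(p_model * (1.0 - p_model))      # Bernoulli variance of the sum
    return raw / np.sqrt(var)                    # roughly N(0,1) absent drift

# An item whose |z| stays large across successive windows is a drift candidate:
# flagged = [abs(item_residual_z(r, p)) > 3.0 for r, p in windows]
```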
Peer reviewed
Belov, Dmitry I. – Journal of Educational Measurement, 2015
The statistical analysis of answer changes (ACs) has uncovered multiple testing irregularities on large-scale assessments and is now routinely performed at testing organizations. However, AC data carry uncertainty caused by technological or human factors. Therefore, existing statistics (e.g., number of wrong-to-right ACs) used to detect examinees…
Descriptors: Statistical Analysis, Robustness (Statistics), Identification, Test Items
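The wrong-to-right count referenced in the abstract is simple to state in code (a minimal sketch with a hypothetical input format):

```python
def count_wtr(initial, final, key):
    """Count wrong-to-right answer changes for one examinee.
    initial/final: first and last recorded answers per item; key: answer key."""
    wtr = 0
    for first, last, correct in zip(initial, final, key):
        if first != last and first != correct and last == correct:
            wtr += 1
    return wtr

# Item 2 changes from a wrong answer to the keyed answer, so the count is 1.
print(count_wtr(["A", "B", "C"], ["A", "D", "B"], ["A", "D", "C"]))  # 1
```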
Peer reviewed
Kim, Sooyeon; Moses, Tim; Yoo, Hanwook – Journal of Educational Measurement, 2015
This inquiry is an investigation of item response theory (IRT) proficiency estimators' accuracy under multistage testing (MST). We chose a two-stage MST design that includes four modules (one at Stage 1, three at Stage 2) and three difficulty paths (low, middle, high). We assembled various two-stage MST panels (i.e., forms) by manipulating two…
Descriptors: Comparative Analysis, Item Response Theory, Computation, Accuracy
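To make the two-stage design concrete, routing in such a panel typically works from a Stage 1 number-correct score (an illustrative Python sketch; the cutoffs are hypothetical, not taken from the study):

```python
def route_stage2(stage1_score, low_cut=7, high_cut=13):
    """Assign one of three Stage 2 difficulty paths from the Stage 1
    number-correct score (cutoffs are illustrative only)."""
    if stage1_score < low_cut:
        return "low"
    if stage1_score < high_cut:
        return "middle"
    return "high"

print(route_stage2(5), route_stage2(10), route_stage2(15))  # low middle high
```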
Peer reviewed
Suh, Youngsuk – Journal of Educational Measurement, 2016
This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…
Descriptors: Effect Size, Goodness of Fit, Statistical Analysis, Statistical Significance
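A hedged sketch of what the two measures compute (a generic form; the article's exact weights and notation may differ): with $P_i(\boldsymbol\theta; \psi)$ the item response probability in the multidimensional model,

$$\mathrm{SWPD}_i = \sum_{j \in F} w_j \Big[P_i(\boldsymbol\theta_j; \hat\psi_R) - P_i(\boldsymbol\theta_j; \hat\psi_F)\Big], \qquad \mathrm{UWPD}_i = \sum_{j \in F} w_j \Big|P_i(\boldsymbol\theta_j; \hat\psi_R) - P_i(\boldsymbol\theta_j; \hat\psi_F)\Big|,$$

with weights $w_j$ (summing to one) taken over the focal group, so the signed version can cancel across regions of the latent space while the unsigned version cannot.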
Peer reviewed
Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015
With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers such as…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis
Peer reviewed
Suh, Youngsuk; Cho, Sun-Joo; Wollack, James A. – Journal of Educational Measurement, 2012
In the presence of test speededness, the parameters of item response theory models can be poorly estimated due to conditional dependencies among items, particularly for end-of-test items (i.e., speeded items). This article presents a systematic comparison of five item calibration procedures--a two-parameter logistic (2PL) model, a…
Descriptors: Response Style (Tests), Timed Tests, Test Items, Item Response Theory
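The baseline 2PL model in this comparison has the standard form

$$P(X_{ij}=1 \mid \theta_j) = \frac{\exp\big[a_i(\theta_j - b_i)\big]}{1 + \exp\big[a_i(\theta_j - b_i)\big]},$$

with discrimination $a_i$ and difficulty $b_i$; speededness violates its local independence assumption for end-of-test items, which is what the alternative calibration procedures try to address.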
Peer reviewed
Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014
With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)
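One regression-style check consistent with the descriptors (a minimal Python sketch under assumed inputs, not necessarily the authors' procedure): fit a regression on uninterrupted examinees, then see whether interrupted examinees score below their predictions.

```python
import numpy as np

def interruption_residuals(x_ok, y_ok, x_int, y_int):
    """Fit y ~ x on uninterrupted examinees, then standardize interrupted
    examinees' observed-minus-predicted scores. x could be, e.g., the score
    on items administered before the interruption occurred."""
    slope, intercept = np.polyfit(x_ok, y_ok, 1)     # least-squares line
    fitted_ok = intercept + slope * np.asarray(x_ok)
    se = np.std(np.asarray(y_ok) - fitted_ok, ddof=2)
    pred = intercept + slope * np.asarray(x_int)
    return (np.asarray(y_int) - pred) / se           # large negative => impact
```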
Peer reviewed
Tendeiro, Jorge N.; Meijer, Rob R. – Journal of Educational Measurement, 2014
In recent guidelines for fair educational testing, it is advised to check the validity of individual test scores through the use of person-fit statistics. For practitioners it is unclear, on the basis of the existing literature, which statistic to use. An overview of relatively simple existing nonparametric approaches to identify atypical response…
Descriptors: Educational Assessment, Test Validity, Scores, Statistical Analysis
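The simplest statistic in this nonparametric family is the Guttman error count, which is easy to state in code (a minimal sketch):

```python
def guttman_errors(resp_sorted):
    """Count Guttman errors in a 0/1 response vector whose items are
    sorted from easiest to hardest (e.g., by proportion correct)."""
    errors = 0
    for e in range(len(resp_sorted)):
        for h in range(e + 1, len(resp_sorted)):
            # error: easier item missed while a harder item is answered correctly
            if resp_sorted[e] == 0 and resp_sorted[h] == 1:
                errors += 1
    return errors

print(guttman_errors([1, 1, 0, 1, 0]))  # 1: item 3 missed, harder item 4 correct
```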
Peer reviewed
Chen, Haiwen – Journal of Educational Measurement, 2012
In this article, linear item response theory (IRT) observed-score equating is compared under a generalized kernel equating framework with Levine observed-score equating for nonequivalent groups with anchor test design. Interestingly, these two equating methods are closely related despite being based on different methodologies. Specifically, when…
Descriptors: Tests, Item Response Theory, Equated Scores, Statistical Analysis
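Both methods build on the basic linear observed-score equating function (the standard form; the Levine and kernel approaches differ in how the synthetic-population moments are estimated from the anchor test):

$$e_Y(x) = \mu_Y + \frac{\sigma_Y}{\sigma_X}\,(x - \mu_X),$$

which maps a score $x$ on form X onto the scale of form Y.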
Peer reviewed
Ranger, Jochen; Kuhn, Jorg-Tobias – Journal of Educational Measurement, 2012
The information matrix can equivalently be determined via the expectation of the Hessian matrix or the expectation of the outer product of the score vector. The identity of these two matrices, however, is only valid in the case of a correctly specified model. Therefore, differences between the two versions of the observed information matrix indicate…
Descriptors: Goodness of Fit, Item Response Theory, Models, Matrices
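The identity in question is the classical information matrix equality: for a log-likelihood $\ell(\theta)$ under a correctly specified model,

$$\mathbb{E}\left[-\frac{\partial^2 \ell(\theta)}{\partial \theta\,\partial \theta^{\top}}\right] = \mathbb{E}\left[\frac{\partial \ell(\theta)}{\partial \theta}\,\frac{\partial \ell(\theta)}{\partial \theta^{\top}}\right],$$

i.e., the expected negative Hessian equals the expected outer product of the score vector; under misspecification the two sides generally diverge, which is what makes their difference usable as a fit diagnostic.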
Peer reviewed
de la Torre, Jimmy; Hong, Yuan; Deng, Weiling – Journal of Educational Measurement, 2010
To better understand the statistical properties of the deterministic inputs, noisy "and" gate cognitive diagnosis (DINA) model, the impact of several factors on the quality of the item parameter estimates and classification accuracy was investigated. Results of the simulation study indicate that the fully Bayes approach is most accurate when the…
Descriptors: Classification, Computation, Models, Simulation
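For reference, the DINA item response function has the standard form: with attribute profile $\boldsymbol\alpha_j$, Q-matrix entries $q_{ik}$, and slip and guessing parameters $s_i$ and $g_i$,

$$\eta_{ij} = \prod_k \alpha_{jk}^{\,q_{ik}}, \qquad P(X_{ij}=1 \mid \boldsymbol\alpha_j) = (1 - s_i)^{\eta_{ij}}\, g_i^{\,1-\eta_{ij}},$$

so an examinee answers correctly with probability $1 - s_i$ when all required attributes are mastered ($\eta_{ij}=1$) and with probability $g_i$ otherwise.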