Showing 1 to 15 of 23 results
Peer reviewed
San Martín, Ernesto; González, Jorge – Journal of Educational and Behavioral Statistics, 2022
The nonequivalent groups with anchor test (NEAT) design is widely used in test equating. Under this design, two groups of examinees are administered different test forms, with each test form containing a subset of common items. Because test takers from different groups are assigned only one test form, missing score data emerge by design, rendering…
Descriptors: Tests, Scores, Statistical Analysis, Models
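To make the missing-by-design structure concrete, here is a minimal sketch of chained linear equating under a NEAT-like setup, using simulated data. Everything in it (score scales, group means, the linear-link helper) is illustrative and not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a NEAT-like design: group 1 takes form X plus anchor V, group 2
# takes form Y plus anchor V; neither group sees the other form, so the
# missing scores arise by design. (All values here are illustrative.)
n = 2000
theta1 = rng.normal(0.0, 1.0, n)               # group 1 ability
theta2 = rng.normal(0.3, 1.0, n)               # group 2 ability (nonequivalent)
x  = 50 + 10 * theta1 + rng.normal(0, 3, n)    # form X scores (group 1 only)
v1 = 20 +  4 * theta1 + rng.normal(0, 2, n)    # anchor scores, group 1
y  = 48 +  9 * theta2 + rng.normal(0, 3, n)    # form Y scores (group 2 only)
v2 = 20 +  4 * theta2 + rng.normal(0, 2, n)    # anchor scores, group 2

def linear_link(mu_from, sd_from, mu_to, sd_to):
    """Linear equating function mapping the 'from' scale to the 'to' scale."""
    return lambda s: mu_to + (sd_to / sd_from) * (s - mu_from)

# Chained linear equating: X -> V within group 1, then V -> Y within group 2.
x_to_v = linear_link(x.mean(), x.std(), v1.mean(), v1.std())
v_to_y = linear_link(v2.mean(), v2.std(), y.mean(), y.std())

print("form-X score -> equated form-Y score")
for s in np.linspace(20, 80, 7):
    print(f"{s:6.1f} -> {v_to_y(x_to_v(s)):6.2f}")
```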
Peer reviewed
Ranger, Jochen; Brauer, Kay – Journal of Educational and Behavioral Statistics, 2022
The generalized S-X² test is a test of item fit for items with a polytomous response format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S-X² test…
Descriptors: Goodness of Fit, Test Items, Statistical Analysis, Item Response Theory
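The core mechanic, comparing observed with model-expected response frequencies within score strata, can be sketched as a Pearson-type statistic. This toy version uses dichotomous 2PL data and the true parameters in place of calibrated ones, so it only approximates the generalized S-X² test discussed in the article.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)

# Simulate 2PL responses for 20 dichotomous items (the article treats
# polytomous items; a binary sketch keeps the mechanics visible).
n, J = 5000, 20
theta = rng.normal(size=n)
a = rng.uniform(0.8, 2.0, J)
b = rng.normal(0.0, 1.0, J)
p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
resp = (rng.random((n, J)) < p).astype(int)

item = 0                                   # item under scrutiny
total = resp.sum(axis=1) - resp[:, item]   # rest score defines the strata

# Pearson-type statistic: within each score stratum, compare the observed
# proportion correct with the model-expected proportion (approximated here
# by the mean model probability of examinees in the stratum).
stat, df = 0.0, 0
for s in np.unique(total):
    mask = total == s
    n_s = mask.sum()
    if n_s < 30:               # sparse strata are simply skipped in this sketch
        continue
    obs = resp[mask, item].mean()
    exp = p[mask, item].mean()
    stat += n_s * (obs - exp) ** 2 / (exp * (1 - exp))
    df += 1
print(f"fit statistic = {stat:.2f}, df ~ {df}, p ~ {chi2.sf(stat, df):.3f}")
```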
Peer reviewed
Demirkaya, Onur; Bezirhan, Ummugul; Zhang, Jinming – Journal of Educational and Behavioral Statistics, 2023
Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, research on detecting item preknowledge has progressed toward using both item scores and response times. Item revisit patterns of examinees can also be utilized as…
Descriptors: Test Items, Prior Learning, Knowledge Level, Reaction Time
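As a rough illustration of combining item scores with response times, the sketch below flags simulated examinees who are both unusually fast and unusually accurate on a hypothetically compromised item subset. It is not the authors' detection procedure, and it omits the revisit-pattern component.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy preknowledge screen on simulated data: the compromised item set and
# all thresholds below are hypothetical choices for illustration.
n, J = 1000, 30
compromised = np.arange(10)              # assumed-leaked items (hypothetical)
log_rt = rng.normal(4.0, 0.5, (n, J))    # log response times
score = (rng.random((n, J)) < 0.6).astype(int)

# Plant 20 examinees with preknowledge: fast and near-perfect on leaked items.
cheat = np.arange(20)
log_rt[np.ix_(cheat, compromised)] -= 1.5
score[np.ix_(cheat, compromised)] = 1

# Standardize speed on the compromised set against each item's time
# distribution, and combine with accuracy on the same set.
z_time = (log_rt - log_rt.mean(0)) / log_rt.std(0)
speed_flag = z_time[:, compromised].mean(1)   # very negative = very fast
acc_flag = score[:, compromised].mean(1)      # high = very accurate

flagged = np.where((speed_flag < -1.5) & (acc_flag > 0.9))[0]
print("flagged examinees:", flagged)
```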
Shear, Benjamin R.; Reardon, Sean F. – Journal of Educational and Behavioral Statistics, 2021
This article describes an extension to the use of heteroskedastic ordered probit (HETOP) models to estimate latent distributional parameters from grouped, ordered-categorical data by pooling across multiple waves of data. We illustrate the method with aggregate proficiency data reporting the number of students in schools or districts scoring in…
Descriptors: Statistical Analysis, Computation, Regression (Statistics), Sample Size
Peer reviewed
Nguyen, Trang Quynh; Stuart, Elizabeth A. – Journal of Educational and Behavioral Statistics, 2020
We address measurement error bias in propensity score (PS) analysis due to covariates that are latent variables. In the setting where latent covariate X is measured via multiple error-prone items W, PS analysis using several proxies for X--the W items themselves, a summary score (mean/sum of the items), or the conventional factor score (i.e.,…
Descriptors: Error of Measurement, Statistical Bias, Error Correction, Probability
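A quick simulation makes the bias visible: building the propensity score from a summary score (the item mean) instead of the latent covariate leaves residual confounding in an inverse-probability-weighted estimate. The data-generating process and all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Latent confounder X drives both treatment and outcome but is observed
# only through four error-prone items W; the propensity score is then
# built from a summary-score proxy, as one of the options the article studies.
n = 20000
x = rng.normal(size=n)
w = x[:, None] + rng.normal(0, 1.0, (n, 4))       # four noisy indicators
summary = w.mean(axis=1)                          # summary-score proxy
treat = rng.random(n) < 1 / (1 + np.exp(-x))      # treatment depends on true X
y = 1.0 * treat + 2.0 * x + rng.normal(0, 1, n)   # true effect = 1.0

def ipw_estimate(covariate):
    """Inverse-probability-weighted effect using a fitted propensity score."""
    ps = LogisticRegression().fit(covariate.reshape(-1, 1), treat)
    e = ps.predict_proba(covariate.reshape(-1, 1))[:, 1]
    return (np.average(y, weights=treat / e)
            - np.average(y, weights=(1 - treat) / (1 - e)))

print("IPW with true X:       ", round(ipw_estimate(x), 3))        # ~1.0
print("IPW with summary score:", round(ipw_estimate(summary), 3))  # residual bias
```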
Peer reviewed
Erps, Ryan C.; Noguchi, Kimihiro – Journal of Educational and Behavioral Statistics, 2020
A new two-sample test for comparing variability measures is proposed. To make the test robust and powerful, a new modified structural zero removal method is applied to the Brown-Forsythe transformation. The t-test-based statistic allows results to be expressed as the ratio of mean absolute deviations from the median. Extensive simulation study…
Descriptors: Statistical Analysis, Comparative Analysis, Robustness (Statistics), Sample Size
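The classical Brown-Forsythe transformation that the proposed test modifies is available directly in SciPy; the sketch below shows it, along with the equivalent t test on absolute deviations from the median, which yields the MAD ratio the abstract mentions. The article's structural zero removal step is not implemented here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Classical Brown-Forsythe comparison of variability: transform each
# observation to its absolute deviation from the group median, then
# compare the transformed samples. (The article's modification, structural
# zero removal plus a MAD-ratio formulation, is not in SciPy; this is the
# standard baseline it builds on.)
a = rng.normal(0, 1.0, 80)
b = rng.normal(0, 1.6, 80)

# SciPy's levene with center='median' is exactly the Brown-Forsythe test.
stat, pval = stats.levene(a, b, center="median")
print(f"Brown-Forsythe W = {stat:.3f}, p = {pval:.4f}")

# The same idea as a two-sample t test on the transformed data, which also
# yields the ratio of mean absolute deviations from the median.
da = np.abs(a - np.median(a))
db = np.abs(b - np.median(b))
t, p = stats.ttest_ind(da, db)
print(f"t = {t:.3f}, p = {p:.4f}, MAD ratio = {da.mean() / db.mean():.3f}")
```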
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) are constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
Feller, Avi; Mealli, Fabrizia; Miratrix, Luke – Journal of Educational and Behavioral Statistics, 2017
Researchers addressing posttreatment complications in randomized trials often turn to principal stratification to define relevant assumptions and quantities of interest. One approach for the subsequent estimation of causal effects in this framework is to use methods based on the "principal score," the conditional probability of belonging…
Descriptors: Scores, Probability, Computation, Program Evaluation
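Under one-sided noncompliance, the principal score reduces to P(complier | covariates), which can be fit in the treated arm, where the stratum is observed, and used to weight the control arm. The sketch below is one such estimator on simulated data, a simplified instance of the framework rather than the article's estimators.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)

# Principal-score sketch under one-sided noncompliance (simulated data).
n = 20000
x = rng.normal(size=n)
z = rng.random(n) < 0.5                            # randomized assignment
complier = rng.random(n) < 1 / (1 + np.exp(-x))    # latent stratum
d = z & complier                                   # uptake only if assigned
y = 1.5 * d + 0.8 * x + rng.normal(0, 1, n)        # effect only via uptake

# Fit the principal score e(x) = P(complier | x) in the treated arm,
# where compliance behavior is observed.
ps_model = LogisticRegression().fit(x[z].reshape(-1, 1), complier[z])
e_all = ps_model.predict_proba(x.reshape(-1, 1))[:, 1]

# Complier effect: treated compliers vs. principal-score-weighted controls.
y_t = y[z & complier].mean()
y_c = np.average(y[~z], weights=e_all[~z])
print(f"estimated complier average effect ~ {y_t - y_c:.3f} (truth 1.5)")
```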
Reardon, Sean F.; Shear, Benjamin R.; Castellano, Katherine E.; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2017
Test score distributions of schools or demographic groups are often summarized by frequencies of students scoring in a small number of ordered proficiency categories. We show that heteroskedastic ordered probit (HETOP) models can be used to estimate means and standard deviations of multiple groups' test score distributions from such data. Because…
Descriptors: Scores, Statistical Analysis, Models, Computation
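A stripped-down version of the estimation problem: recover one group's mean and standard deviation from category counts by maximizing a multinomial likelihood with normal-CDF category probabilities. For brevity the cut scores are treated as known, whereas the HETOP model identifies them jointly across groups; the same machinery underlies the pooled multiple-wave extension listed above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(6)

# Recover (mu, sigma) from counts in 4 ordered proficiency categories.
cuts = np.array([-0.8, 0.0, 0.9])   # assumed-known cut scores (illustrative)

def coarsen(mu, sigma, n):
    """Simulate one group's frequency table of categorized scores."""
    scores = rng.normal(mu, sigma, n)
    return np.bincount(np.searchsorted(cuts, scores), minlength=4)

def neg_loglik(params, counts):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    # Category probabilities from the normal CDF evaluated at the cut scores.
    cdf = norm.cdf((np.concatenate(([-np.inf], cuts, [np.inf])) - mu) / sigma)
    probs = np.clip(np.diff(cdf), 1e-12, None)
    return -np.sum(counts * np.log(probs))

counts = coarsen(mu=0.4, sigma=1.3, n=500)   # one school's frequency table
fit = minimize(neg_loglik, x0=[0.0, 0.0], args=(counts,), method="Nelder-Mead")
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print(f"estimated mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f} (truth 0.4, 1.3)")
```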
Peer reviewed
Liu, Yang; Yang, Ji Seung – Journal of Educational and Behavioral Statistics, 2018
The uncertainty arising from item parameter estimation is often not negligible and must be accounted for when calculating latent variable (LV) scores in item response theory (IRT). This is particularly so when the calibration sample size is limited and/or the calibration IRT model is complex. In the current work, we treat two-stage IRT scoring as a…
Descriptors: Intervals, Scores, Item Response Theory, Bayesian Statistics
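One way to see why calibration uncertainty matters for scoring: compute an EAP score holding item parameters fixed, then redraw the parameters from an assumed sampling distribution and watch the score vary. This is an illustrative perturbation exercise, not necessarily the authors' approach.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

# Stage 1 (assumed done): calibrated 2PL item parameters with known SEs.
J = 15
a_hat = rng.uniform(0.8, 2.0, J)            # calibrated discriminations
b_hat = rng.normal(0.0, 1.0, J)             # calibrated difficulties
se_b = 0.15                                 # assumed calibration SE (illustrative)
resp = (rng.random(J) < 0.5).astype(int)    # one examinee's response pattern

grid = np.linspace(-4, 4, 81)               # quadrature over the latent trait
prior = norm.pdf(grid)

def eap(a, b):
    """Stage 2: EAP latent score under a 2PL model via fixed quadrature."""
    p = 1 / (1 + np.exp(-a * (grid[:, None] - b)))      # grid x items
    like = np.prod(np.where(resp == 1, p, 1 - p), axis=1)
    post = like * prior
    return np.sum(grid * post) / np.sum(post)

point = eap(a_hat, b_hat)
# Propagate calibration error: redraw difficulties around their estimates.
draws = [eap(a_hat, b_hat + rng.normal(0, se_b, J)) for _ in range(200)]
print(f"EAP = {point:.3f}; SD across parameter draws = {np.std(draws):.3f}")
```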
Peer reviewed
Magis, David; Tuerlinckx, Francis; De Boeck, Paul – Journal of Educational and Behavioral Statistics, 2015
This article proposes a novel approach to detect differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we propose the "LR lasso DIF method": a logistic regression (LR) model is formulated for all item responses. The model contains item-specific intercepts,…
Descriptors: Test Bias, Test Items, Regression (Statistics), Scores
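The flavor of the approach can be sketched with a long-format logistic regression whose design matrix contains item intercepts, a matching score, and item-by-group DIF terms, fit with an L1 penalty. One simplification to note: scikit-learn's lasso penalizes every coefficient, whereas the method penalizes only the DIF parameters.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)

# Simulate 10 Rasch-like items where item 3 carries DIF against one group.
n, J = 2000, 10
group = rng.random(n) < 0.5
theta = rng.normal(size=n)
b = rng.normal(0, 1, J)
dif = np.zeros(J); dif[3] = 1.0
p = 1 / (1 + np.exp(-(theta[:, None] - b + dif * group[:, None])))
resp = (rng.random((n, J)) < p).astype(int)

# Long format: one row per (person, item) response.
total = resp.sum(1)
rows, items = np.repeat(np.arange(n), J), np.tile(np.arange(J), n)
X = np.zeros((n * J, 2 * J + 1))
X[np.arange(n * J), items] = 1.0                       # item intercepts
X[:, J] = (total[rows] - total.mean()) / total.std()   # matching variable
X[np.arange(n * J), J + 1 + items] = group[rows]       # item-by-group (DIF)
y = resp.ravel()

# L1-penalized fit; nonzero item-by-group coefficients indicate DIF.
fit = LogisticRegression(penalty="l1", solver="liblinear",
                         C=0.05, fit_intercept=False).fit(X, y)
print("item-by-group coefficients:", np.round(fit.coef_[0][J + 1:], 2))
```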
Peer reviewed
Camilli, Gregory; Fox, Jean-Paul – Journal of Educational and Behavioral Statistics, 2015
An aggregation strategy is proposed to address practical limitations related to computing resources for two-level multidimensional item response theory (MIRT) models with large data sets. The aggregate model is derived by integration of the normal ogive model, and an adaptation of the stochastic approximation expectation maximization…
Descriptors: Factor Analysis, Item Response Theory, Grade 4, Simulation
Peer reviewed
Lockwood, J. R.; McCaffrey, Daniel F. – Journal of Educational and Behavioral Statistics, 2014
A common strategy for estimating treatment effects in observational studies using individual student-level data is analysis of covariance (ANCOVA) or hierarchical variants of it, in which outcomes (often standardized test scores) are regressed on pretreatment test scores, other student characteristics, and treatment group indicators. Measurement…
Descriptors: Error of Measurement, Scores, Statistical Analysis, Computation
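A short simulation shows the problem: when assignment depends on the true pretest, adjusting for an error-prone pretest only partially removes the confounding, biasing the treatment coefficient. The data-generating values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(9)

# ANCOVA with a noisy pretest: residual confounding in the treatment effect.
n = 50000
true_pre = rng.normal(size=n)
treat = (rng.random(n) < 1 / (1 + np.exp(-true_pre))).astype(float)
post = 0.5 * treat + 0.9 * true_pre + rng.normal(0, 1, n)   # true effect 0.5
obs_pre = true_pre + rng.normal(0, 0.7, n)   # reliability ~ 1/(1+0.49) ~ 0.67

def ancova_effect(pre):
    """OLS of posttest on intercept, pretest, and treatment indicator."""
    X = np.column_stack([np.ones(n), pre, treat])
    beta, *_ = np.linalg.lstsq(X, post, rcond=None)
    return beta[2]

print("effect adjusting for true pretest: ", round(ancova_effect(true_pre), 3))
print("effect adjusting for noisy pretest:", round(ancova_effect(obs_pre), 3))
```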
Peer reviewed
Briggs, Derek C.; Domingue, Ben – Journal of Educational and Behavioral Statistics, 2013
It is often assumed that a vertical scale is necessary when value-added models depend upon the gain scores of students across two or more points in time. This article examines the conditions under which the scale transformations associated with the vertical scaling process would be expected to have a significant impact on normative interpretations…
Descriptors: Evaluation Methods, Scaling, Scores, Achievement Tests
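The sensitivity at issue can be demonstrated directly: apply a monotone (rank-preserving) transformation to a vertical scale and check whether students' gain-score ranks survive. In the simulated sketch below they do not do so perfectly, which is the kind of normative consequence the article examines.

```python
import numpy as np

rng = np.random.default_rng(10)

# Gain scores are not invariant to monotone rescalings of a vertical scale:
# a transformation that preserves every score's rank can still reorder
# students by gain. Simulated scores; the scale values are illustrative.
n = 1000
t1 = rng.normal(400, 50, n)                 # wave-1 vertical scale scores
t2 = t1 + rng.normal(30, 20, n)             # wave-2 scores with growth

transform = lambda s: 100 * np.log(s)       # monotone rescaling of the scale

gain_raw = t2 - t1
gain_tr = transform(t2) - transform(t1)

# Compare students' gain ranks under the two scalings.
rank_raw = np.argsort(np.argsort(gain_raw))
rank_tr = np.argsort(np.argsort(gain_tr))
corr = np.corrcoef(rank_raw, rank_tr)[0, 1]
print(f"rank correlation of gains across scalings: {corr:.3f}")
print("top-decile overlap:",
      np.intersect1d(np.argsort(gain_raw)[-100:],
                     np.argsort(gain_tr)[-100:]).size, "/ 100")
```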
Peer reviewed
Karl, Andrew T.; Yang, Yan; Lohr, Sharon L. – Journal of Educational and Behavioral Statistics, 2013
Value-added models have been widely used to assess the contributions of individual teachers and schools to students' academic growth based on longitudinal student achievement outcomes. There is concern, however, that ignoring the presence of missing values, which are common in longitudinal studies, can bias teachers' value-added scores.…
Descriptors: Evaluation Methods, Teacher Effectiveness, Academic Achievement, Achievement Gains