Publication Date
  In 2025: 0
  Since 2024: 1
  Since 2021 (last 5 years): 10
  Since 2016 (last 10 years): 27
  Since 2006 (last 20 years): 56
Descriptor
  Probability: 76
  Computation: 28
  Statistical Analysis: 23
  Models: 22
  Item Response Theory: 16
  Bayesian Statistics: 15
  Test Items: 15
  Simulation: 13
  Scores: 12
  Classification: 10
  Error of Measurement: 10
Source
  Journal of Educational and Behavioral Statistics: 76
Author
  Johnson, Matthew S.: 4
  Sinharay, Sandip: 4
  van der Linden, Wim J.: 4
  Gelman, Andrew: 3
  Hong, Guanglei: 3
  Tipton, Elizabeth: 3
  Culpepper, Steven Andrew: 2
  De Boeck, Paul: 2
  Mealli, Fabrizia: 2
  Qin, Xu: 2
  Schuster, Christof: 2
Publication Type
  Journal Articles: 76
  Reports - Research: 42
  Reports - Evaluative: 21
  Reports - Descriptive: 12
  Guides - Non-Classroom: 1
Location
  Italy: 2
  Belgium: 1
  California: 1
  California (Los Angeles): 1
  California (Riverside): 1
  Indiana: 1
  Netherlands: 1
  Netherlands (Amsterdam): 1
  Pennsylvania: 1
  Sweden: 1
  Texas: 1
Assessments and Surveys
  National Assessment of…: 4
  Center for Epidemiologic…: 1
  Early Childhood Longitudinal…: 1
  Law School Admission Test: 1
  National Longitudinal Study…: 1
  Trends in International…: 1
San Martín, Ernesto; González, Jorge – Journal of Educational and Behavioral Statistics, 2022
The nonequivalent groups with anchor test (NEAT) design is widely used in test equating. Under this design, two groups of examinees are administered different test forms, with each form containing a subset of common items. Because test takers from different groups are assigned only one test form, missing score data emerge by design, rendering…
Descriptors: Tests, Scores, Statistical Analysis, Models
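The NEAT setup lends itself to a brief illustration. The sketch below is not the authors' treatment of the missing-data problem; it only shows one standard way scores can be linked through the common (anchor) items, chained linear equating, on simulated data. All variable names and values are hypothetical.

```python
# Chained linear equating under the NEAT design: link form X to the anchor V
# in the group that took X, then link V to form Y in the group that took Y.
# The score vectors below are simulated for illustration only.
import numpy as np

def linear_link(x, mu_from, sd_from, mu_to, sd_to):
    """Linear linking: match means and standard deviations."""
    return mu_to + (sd_to / sd_from) * (x - mu_from)

def chained_linear_equate(x_scores, x1, v1, v2, y2):
    """Carry form-X scores to the form-Y scale through the anchor V."""
    v_equiv = linear_link(x_scores, x1.mean(), x1.std(ddof=1),
                          v1.mean(), v1.std(ddof=1))       # X -> V (group 1)
    return linear_link(v_equiv, v2.mean(), v2.std(ddof=1),
                       y2.mean(), y2.std(ddof=1))           # V -> Y (group 2)

rng = np.random.default_rng(0)
x1 = rng.normal(30, 6, 500); v1 = 0.5 * x1 + rng.normal(5, 2, 500)
y2 = rng.normal(32, 5, 500); v2 = 0.5 * y2 + rng.normal(4, 2, 500)
print(chained_linear_equate(np.array([20.0, 30.0, 40.0]), x1, v1, v2, y2))
```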
Liang, Qianru; de la Torre, Jimmy; Law, Nancy – Journal of Educational and Behavioral Statistics, 2023
To expand the use of cognitive diagnosis models (CDMs) to longitudinal assessments, this study proposes a bias-corrected three-step estimation approach for latent transition CDMs with covariates by integrating a general CDM and a latent transition model. The proposed method can be used to assess changes in attribute mastery status and attribute…
Descriptors: Cognitive Measurement, Models, Statistical Bias, Computation
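For readers new to CDMs, the sketch below shows the item response probability of the DINA model, one common member of the general CDM family; it is not the bias-corrected three-step estimator proposed here, and the attribute pattern, Q-vector, and guessing/slipping values are purely illustrative.

```python
# DINA model: an examinee answers correctly with probability (1 - slip) if they
# master every attribute the item requires, and with probability guess otherwise.
import numpy as np

def dina_prob_correct(alpha, q_vector, guess, slip):
    """P(correct) given attribute-mastery vector alpha and item Q-vector."""
    eta = int(np.all(alpha[q_vector == 1] == 1))  # masters all required attributes?
    return (1 - slip) ** eta * guess ** (1 - eta)

alpha = np.array([1, 0, 1])   # examinee masters attributes 1 and 3
q = np.array([1, 0, 1])       # item requires attributes 1 and 3
print(dina_prob_correct(alpha, q, guess=0.2, slip=0.1))  # 0.9
```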
van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2022
The current literature on test equating generally defines it as the process necessary to obtain score comparability between different test forms. This definition contrasts with Lord's foundational paper, which viewed equating as the process required to obtain comparability of measurement scale between forms. The distinction between the notions…
Descriptors: Equated Scores, Test Items, Scores, Probability
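For concreteness, here is a minimal sketch of equating in the score-comparability sense discussed above: equipercentile equating maps each form-X score to the form-Y score with the same percentile rank. The simulated score distributions are illustrative and not drawn from the article.

```python
# Equipercentile equating: y(x) is the form-Y score whose cumulative proportion
# matches the percentile rank of x on form X.
import numpy as np

def equipercentile(x_scores, x_dist, y_dist):
    pct = np.array([np.mean(x_dist <= x) for x in x_scores])  # percentile rank on X
    return np.quantile(y_dist, pct)                            # matching quantile on Y

rng = np.random.default_rng(2)
x_dist = rng.normal(30, 6, 2000)   # simulated form-X scores
y_dist = rng.normal(33, 5, 2000)   # simulated form-Y scores
print(equipercentile(np.array([20.0, 30.0, 40.0]), x_dist, y_dist))
```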
Sinharay, Sandip; Johnson, Matthew S. – Journal of Educational and Behavioral Statistics, 2021
Score differencing is one of the six categories of statistical methods used to detect test fraud (Wollack & Schoenig, 2018) and involves testing the null hypothesis that an examinee's performance is similar over two item sets against the alternative hypothesis that performance is better on one of the item sets. We suggest, to…
Descriptors: Probability, Bayesian Statistics, Cheating, Statistical Analysis
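A minimal sketch of the generic score-differencing idea, not the specific statistics the authors propose: compare an examinee's proportion correct on two item sets with a simple two-proportion z test. The counts are made up.

```python
# Two-proportion z test of H0: performance is the same on item sets A and B,
# against the one-sided alternative that performance is better on set B.
import numpy as np
from scipy.stats import norm

def score_difference_test(correct_a, n_a, correct_b, n_b):
    p_a, p_b = correct_a / n_a, correct_b / n_b
    p_pool = (correct_a + correct_b) / (n_a + n_b)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return z, norm.sf(z)   # one-sided p-value

z, p = score_difference_test(correct_a=12, n_a=30, correct_b=27, n_b=30)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```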
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2023
This study explores the usefulness of covariates on equating test scores from nonequivalent test groups. The covariates are captured by an estimated propensity score, which is used as a proxy for latent ability to balance the test groups. The objective is to assess the sensitivity of the equated scores to various misspecifications in the…
Descriptors: Models, Error of Measurement, Robustness (Statistics), Equated Scores
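A rough sketch of the propensity-score step described above, assuming logistic regression is used to estimate it (the article's actual specification may differ): the estimated score serves as a proxy for latent ability when balancing the two nonequivalent groups. All covariates and group labels are simulated.

```python
# Estimate a propensity score from background covariates and form
# inverse-propensity weights that balance the two test groups.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
covariates = rng.normal(size=(n, 3))                           # e.g., grades, age, program
group = rng.binomial(1, 1 / (1 + np.exp(-covariates[:, 0])))   # membership depends on covariates

pscore = LogisticRegression().fit(covariates, group).predict_proba(covariates)[:, 1]
weights = np.where(group == 1, 1 / pscore, 1 / (1 - pscore))   # balancing weights
print(weights[:5])
```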
Clemens Draxler; Andreas Kurz; Can Gürer; Jan Philipp Nolte – Journal of Educational and Behavioral Statistics, 2024
A modified and improved inductive inferential approach to evaluate item discriminations in a conditional maximum likelihood and Rasch modeling framework is suggested. The new approach involves the derivation of four hypothesis tests. It implies a linear restriction of the assumed set of probability distributions in the classical approach that…
Descriptors: Inferences, Test Items, Item Analysis, Maximum Likelihood Statistics
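For orientation, the restriction at stake can be seen in the item response functions below: the Rasch model fixes every item's discrimination at 1, while the two-parameter logistic (2PL) model lets it vary. This is only an illustration of that contrast, not the four hypothesis tests derived in the article; parameter values are arbitrary.

```python
# 2PL item response function; setting a = 1 recovers the Rasch model.
import numpy as np

def irf_2pl(theta, a, b):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
print(irf_2pl(theta, a=1.0, b=0.0))   # Rasch item
print(irf_2pl(theta, a=1.8, b=0.0))   # item with higher discrimination
```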
Feinberg, Richard A.; von Davier, Matthias – Journal of Educational and Behavioral Statistics, 2020
The literature showing that subscores fail to add value is vast; yet despite their typical redundancy and the frequent presence of substantial statistical errors, many stakeholders remain convinced of their necessity. This article describes a method for identifying and reporting unexpectedly high or low subscores by comparing each examinee's…
Descriptors: Scores, Probability, Statistical Distributions, Ability
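A hypothetical sketch of the general flagging idea, not the authors' exact procedure: mark a subscore as unexpectedly high or low when it falls in the tails of the binomial distribution implied by the examinee's overall proportion correct. Thresholds and counts are illustrative.

```python
# Flag a subscore whose value is improbable given overall performance.
from scipy.stats import binom

def flag_subscore(sub_correct, sub_items, total_correct, total_items, alpha=0.05):
    p_overall = total_correct / total_items
    p_low = binom.cdf(sub_correct, sub_items, p_overall)        # P(X <= observed)
    p_high = binom.sf(sub_correct - 1, sub_items, p_overall)    # P(X >= observed)
    if p_low < alpha / 2:
        return "unexpectedly low"
    if p_high < alpha / 2:
        return "unexpectedly high"
    return "not flagged"

print(flag_subscore(sub_correct=3, sub_items=20, total_correct=70, total_items=100))
```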
The Reliability of the Posterior Probability of Skill Attainment in Diagnostic Classification Models
Johnson, Matthew S.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2020
One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three…
Descriptors: Reliability, Probability, Skill Development, Classification
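To make the reported score concrete, the sketch below computes posterior means of skill-mastery indicators under a toy two-attribute DINA-type model; the Q-matrix, item parameters, and uniform prior are illustrative, and the reliability measures discussed in the article are not implemented here.

```python
# Posterior probability of mastering each skill, given a short response vector.
import itertools
import numpy as np

q_matrix = np.array([[1, 0], [0, 1], [1, 1]])    # attributes required by each item
guess = np.array([0.2, 0.2, 0.2])
slip = np.array([0.1, 0.1, 0.1])
profiles = np.array(list(itertools.product([0, 1], repeat=2)))  # all mastery patterns
prior = np.full(len(profiles), 1 / len(profiles))

def likelihood(responses, profile):
    eta = np.all(profile >= q_matrix, axis=1).astype(float)  # masters all required skills?
    p = (1 - slip) ** eta * guess ** (1 - eta)
    return np.prod(p ** responses * (1 - p) ** (1 - responses))

responses = np.array([1, 0, 1])
post = np.array([likelihood(responses, a) for a in profiles]) * prior
post /= post.sum()
print((post[:, None] * profiles).sum(axis=0))   # posterior means of the two indicators
```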
Kuijpers, Renske E.; Visser, Ingmar; Molenaar, Dylan – Journal of Educational and Behavioral Statistics, 2021
Mixture models have been developed to enable detection of within-subject differences in responses and response times to psychometric test items. To enable mixture modeling of both responses and response times, a distributional assumption is needed for the within-state response time distribution. Since violations of the assumed response time…
Descriptors: Test Items, Responses, Reaction Time, Models
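A minimal sketch of the kind of distributional assumption at issue: a two-state mixture of lognormal response-time densities, one common within-state choice. The mixing proportions and log-scale parameters are illustrative.

```python
# Density of a two-state lognormal mixture for response times.
import numpy as np
from scipy.stats import lognorm

def mixture_rt_density(t, pi, mu, sigma):
    """pi: state probabilities; mu, sigma: log-scale mean and sd per state."""
    parts = [p * lognorm.pdf(t, s=s, scale=np.exp(m)) for p, m, s in zip(pi, mu, sigma)]
    return np.sum(parts, axis=0)

t = np.linspace(0.5, 60, 5)
print(mixture_rt_density(t, pi=[0.4, 0.6], mu=[1.0, 2.5], sigma=[0.4, 0.5]))
```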
Lubbe, Dirk; Schuster, Christof – Journal of Educational and Behavioral Statistics, 2020
Extreme response style is the tendency of individuals to prefer the extreme categories of a rating scale irrespective of item content. It has been shown repeatedly that individual response style differences affect the reliability and validity of item responses and should, therefore, be considered carefully. To account for extreme response style…
Descriptors: Response Style (Tests), Rating Scales, Item Response Theory, Models
Hung, Su-Pin; Huang, Hung-Yu – Journal of Educational and Behavioral Statistics, 2022
To address response style or bias in rating scales, forced-choice items are often used to request that respondents rank their attitudes or preferences among a limited set of options. The rating scales used by raters to render judgments on ratees' performance also contribute to rater bias or errors; consequently, forced-choice items have recently…
Descriptors: Evaluation Methods, Rating Scales, Item Analysis, Preferences
Pang, Bo; Nijkamp, Erik; Wu, Ying Nian – Journal of Educational and Behavioral Statistics, 2020
This review covers the core concepts and design decisions of TensorFlow. TensorFlow, originally created by researchers at Google, is the most popular of the many deep learning libraries. In the field of deep learning, neural networks have achieved tremendous success and gained wide popularity in various areas. This family of models…
Descriptors: Artificial Intelligence, Regression (Statistics), Models, Classification
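A small example of the style of model building the review covers, using TensorFlow's Keras API to define and fit a tiny feed-forward network on synthetic data; the architecture and data are illustrative only.

```python
# Define, compile, and train a small neural network with tf.keras.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)).astype("float32")
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype("float32")   # simple binary target

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))   # [loss, accuracy] on the training data
```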
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2019
When equating two test forms, the equated scores will be biased if the test groups differ in ability. To adjust for the ability imbalance between nonequivalent groups, a set of common items is often used. When no common items are available, it has been suggested to use covariates correlated with the test scores instead. In this article, we reduce…
Descriptors: Equated Scores, Test Items, Probability, College Entrance Examinations
Nguyen, Trang Quynh; Stuart, Elizabeth A. – Journal of Educational and Behavioral Statistics, 2020
We address measurement error bias in propensity score (PS) analysis due to covariates that are latent variables. In the setting where latent covariate X is measured via multiple error-prone items W, PS analysis using several proxies for X--the W items themselves, a summary score (mean/sum of the items), or the conventional factor score (i.e.,…
Descriptors: Error of Measurement, Statistical Bias, Error Correction, Probability
Lyu, Weicong; Kim, Jee-Seon; Suk, Youmi – Journal of Educational and Behavioral Statistics, 2023
This article presents a latent class model for multilevel data to identify latent subgroups and estimate heterogeneous treatment effects. Unlike sequential approaches that partition data first and then estimate average treatment effects (ATEs) within classes, we employ a Bayesian procedure to jointly estimate mixing probability, selection, and…
Descriptors: Hierarchical Linear Modeling, Bayesian Statistics, Causal Models, Statistical Inference