Bonett, Douglas G.; Price, Robert M., Jr. – Journal of Educational and Behavioral Statistics, 2020
In studies where the response variable is measured on a ratio scale, a ratio of means or medians provides a standardized measure of effect size that is an alternative to the popular standardized mean difference. Confidence intervals for ratios of population means and medians in independent-samples designs and paired-samples designs are proposed as…
Descriptors: Computation, Statistical Analysis, Mathematical Concepts, Effect Size
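As a rough illustration of the ratio-of-means effect size described in this abstract, the sketch below computes a large-sample confidence interval for a ratio of two independent means using the standard log-scale delta method. This is an assumption for illustration only and not necessarily the interval the authors propose (the abstract is truncated); the function name `ratio_of_means_ci` and the sample data are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def ratio_of_means_ci(x, y, level=0.95):
    """Delta-method CI for mean(x) / mean(y), independent samples.

    Hypothetical helper: a textbook large-sample interval built on the
    log scale, not taken from the article itself.
    """
    m1, m2 = mean(x), mean(y)
    if m1 <= 0 or m2 <= 0:
        raise ValueError("ratio-scale data should have positive means")
    # Approximate variance of log(m1 / m2): sum of squared coefficients
    # of variation of the two sample means.
    se_log = math.sqrt(
        variance(x) / (len(x) * m1 ** 2) + variance(y) / (len(y) * m2 ** 2)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    log_ratio = math.log(m1 / m2)
    # Back-transform the log-scale limits to the ratio scale.
    return m1 / m2, math.exp(log_ratio - z * se_log), math.exp(log_ratio + z * se_log)


# Made-up ratio-scale measurements (e.g., response times in seconds).
treatment = [10.0, 12.0, 14.0, 16.0]
control = [5.0, 6.0, 7.0, 8.0]
ratio, lower, upper = ratio_of_means_ci(treatment, control)
print(f"ratio = {ratio:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the interval is built on the log scale, it is symmetric around the point estimate in multiplicative terms and its limits stay positive, which suits ratio-scale data.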
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2022
Takers of educational tests often receive proficiency levels instead of or in addition to scaled scores. For example, proficiency levels are reported for the Advanced Placement (AP®) and U.S. Medical Licensing examinations. Technical difficulties and other unforeseen events occasionally lead to missing item scores and hence to incomplete data on…
Descriptors: Computation, Data Analysis, Educational Testing, Accuracy
Ramsay, James; Wiberg, Marie; Li, Juan – Journal of Educational and Behavioral Statistics, 2020
Ramsay and Wiberg used a new version of item response theory that represents test performance over nonnegative closed intervals such as [0, 100] or [0, n] and demonstrated that optimal scoring of binary test data yielded substantial improvements in point-wise root-mean-squared error and bias over number right or sum scoring. We extend these…
Descriptors: Scoring, Weighted Scores, Item Response Theory, Intervals
The Reliability of the Posterior Probability of Skill Attainment in Diagnostic Classification Models
Johnson, Matthew S.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2020
One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three…
Descriptors: Reliability, Probability, Skill Development, Classification
Benjamin Lu; Eli Ben-Michael; Avi Feller; Luke Miratrix – Journal of Educational and Behavioral Statistics, 2023
In multisite trials, learning about treatment effect variation across sites is critical for understanding where and for whom a program works. Unadjusted comparisons, however, capture "compositional" differences in the distributions of unit-level features as well as "contextual" differences in site-level features, including…
Descriptors: Statistical Analysis, Statistical Distributions, Program Implementation, Comparative Analysis
Gao, Xuliang; Ma, Wenchao; Wang, Daxun; Cai, Yan; Tu, Dongbo – Journal of Educational and Behavioral Statistics, 2021
This article proposes a class of cognitive diagnosis models (CDMs) for polytomously scored items with different link functions. Many existing polytomous CDMs can be considered as special cases of the proposed class of polytomous CDMs. Simulation studies were carried out to investigate the feasibility of the proposed CDMs and the performance of…
Descriptors: Cognitive Measurement, Models, Test Items, Scoring
Schochet, Peter Z. – Journal of Educational and Behavioral Statistics, 2020
This article discusses estimation of average treatment effects for randomized controlled trials (RCTs) using grouped administrative data to help improve data access. The focus is on design-based estimators, derived using the building blocks of experiments, that are conducive to grouped data for a wide range of RCT designs, including clustered and…
Descriptors: Randomized Controlled Trials, Data Analysis, Research Design, Multivariate Analysis
van der Linden, Wim J.; Ren, Hao – Journal of Educational and Behavioral Statistics, 2020
The Bayesian way of accounting for the effects of error in the ability and item parameters in adaptive testing is through the joint posterior distribution of all parameters. An optimized Markov chain Monte Carlo algorithm for adaptive testing is presented, which samples this distribution in real time to score the examinee's ability and optimally…
Descriptors: Bayesian Statistics, Adaptive Testing, Error of Measurement, Markov Processes
Jewsbury, Paul A.; van Rijn, Peter W. – Journal of Educational and Behavioral Statistics, 2020
In large-scale educational assessment data consistent with a simple-structure multidimensional item response theory (MIRT) model, where every item measures only one latent variable, separate unidimensional item response theory (UIRT) models for each latent variable are often calibrated for practical reasons. While this approach can be valid for…
Descriptors: Item Response Theory, Computation, Test Items, Adaptive Testing
Park, Soojin; Palardy, Gregory J. – Journal of Educational and Behavioral Statistics, 2020
Estimating the effects of randomized experiments and, by extension, their mediating mechanisms, is often complicated by treatment noncompliance. Two estimation methods for causal mediation in the presence of noncompliance have recently been proposed, the instrumental variable method (IV-mediate) and maximum likelihood method (ML-mediate). However,…
Descriptors: Computation, Compliance (Psychology), Maximum Likelihood Statistics, Statistical Analysis
Vegetabile, Brian G.; Stout-Oswald, Stephanie A.; Davis, Elysia Poggi; Baram, Tallie Z.; Stern, Hal S. – Journal of Educational and Behavioral Statistics, 2019
Predictability of behavior is an important characteristic in many fields including biology, medicine, marketing, and education. When a sequence of actions performed by an individual can be modeled as a stationary time-homogeneous Markov chain the predictability of the individual's behavior can be quantified by the entropy rate of the process. This…
Descriptors: Markov Processes, Prediction, Behavior, Computation
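The entropy rate mentioned in this abstract has a closed form for a stationary Markov chain with a known transition matrix P and stationary distribution π: H = −Σ_i π_i Σ_j P_ij log P_ij. The sketch below evaluates that formula directly. It assumes the transition matrix is given and that the chain has a unique stationary distribution; it does not reproduce the estimation procedure in the article, and the function names are hypothetical.

```python
import numpy as np


def stationary_distribution(P):
    """Solve pi @ P = pi with sum(pi) = 1 by least squares.

    Assumes the chain has a unique stationary distribution.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Stack the balance equations (P^T - I) pi = 0 with the
    # normalization constraint sum(pi) = 1.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def entropy_rate(P, base=2.0):
    """Entropy rate of a stationary Markov chain, in units set by `base`."""
    P = np.asarray(P, dtype=float)
    pi = stationary_distribution(P)
    # Use the convention 0 * log(0) = 0 for impossible transitions.
    with np.errstate(divide="ignore", invalid="ignore"):
        log_P = np.where(P > 0, np.log(P), 0.0)
    return float(-(pi[:, None] * P * log_P).sum() / np.log(base))


# Two extreme two-state chains: fully random versus fully predictable.
coin_flip = [[0.5, 0.5], [0.5, 0.5]]
alternating = [[0.0, 1.0], [1.0, 0.0]]
print(entropy_rate(coin_flip))
print(entropy_rate(alternating))
```

A lower entropy rate corresponds to more predictable behavior: the alternating chain has no uncertainty about its next state, while the coin-flip chain has the maximum possible for two states.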
Reardon, Sean F.; Shear, Benjamin R.; Castellano, Katherine E.; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2017
Test score distributions of schools or demographic groups are often summarized by frequencies of students scoring in a small number of ordered proficiency categories. We show that heteroskedastic ordered probit (HETOP) models can be used to estimate means and standard deviations of multiple groups' test score distributions from such data. Because…
Descriptors: Scores, Statistical Analysis, Models, Computation
Thoemmes, Felix; Liao, Wang; Jin, Ze – Journal of Educational and Behavioral Statistics, 2017
This article describes the analysis of regression-discontinuity designs (RDDs) using the R packages rdd, rdrobust, and rddtools. We discuss similarities and differences between these packages and provide directions on how to use them effectively. We use real data from the Carolina Abecedarian Project to show how an analysis of an RDD can be…
Descriptors: Regression (Statistics), Research Design, Robustness (Statistics), Computer Software
McCoach, D. Betsy; Rifenbark, Graham G.; Newton, Sarah D.; Li, Xiaoran; Kooken, Janice; Yomtov, Dani; Gambino, Anthony J.; Bellara, Aarti – Journal of Educational and Behavioral Statistics, 2018
This study compared five common multilevel software packages via Monte Carlo simulation: HLM 7, Mplus 7.4, R (lme4 V1.1-12), Stata 14.1, and SAS 9.4 to determine how the programs differ in estimation accuracy and speed, as well as convergence, when modeling multiple randomly varying slopes of different magnitudes. Simulated data…
Descriptors: Hierarchical Linear Modeling, Computer Software, Comparative Analysis, Monte Carlo Methods
Sweet, Tracy M.; Junker, Brian W. – Journal of Educational and Behavioral Statistics, 2016
The hierarchical network model (HNM) is a framework introduced by Sweet, Thomas, and Junker for modeling interventions and other covariate effects on ensembles of social networks, such as what would be found in randomized controlled trials in education research. In this article, we develop calculations for the power to detect an intervention…
Descriptors: Intervention, Social Networks, Statistical Analysis, Computation