Showing 1 to 15 of 20 results
Peer reviewed
Robitzsch, Alexander; Lüdtke, Oliver – Journal of Educational and Behavioral Statistics, 2022
One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance,…
Descriptors: Test Bias, International Assessment, Scaling, Comparative Analysis
Peer reviewed
Kuijpers, Renske E.; Visser, Ingmar; Molenaar, Dylan – Journal of Educational and Behavioral Statistics, 2021
Mixture models have been developed to enable detection of within-subject differences in responses and response times to psychometric test items. To enable mixture modeling of both responses and response times, a distributional assumption is needed for the within-state response time distribution. Since violations of the assumed response time…
Descriptors: Test Items, Responses, Reaction Time, Models
Peer reviewed
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2019
When equating two test forms, the equated scores will be biased if the test groups differ in ability. To adjust for the ability imbalance between nonequivalent groups, a set of common items is often used. When no common items are available, it has been suggested to use covariates correlated with the test scores instead. In this article, we reduce…
Descriptors: Equated Scores, Test Items, Probability, College Entrance Examinations
Peer reviewed
Suk, Youmi; Kim, Jee-Seon; Kang, Hyunseung – Journal of Educational and Behavioral Statistics, 2021
There has been increasing interest in exploring heterogeneous treatment effects using machine learning (ML) methods such as causal forests, Bayesian additive regression trees, and targeted maximum likelihood estimation. However, there is little work on applying these methods to estimate treatment effects in latent classes defined by…
Descriptors: Artificial Intelligence, Statistical Analysis, Statistical Inference, Classification
Sales, Adam C.; Hansen, Ben B. – Journal of Educational and Behavioral Statistics, 2020
Conventionally, regression discontinuity analysis contrasts a univariate regression's limits as its independent variable, "R," approaches a cut point, "c," from either side. Alternative methods target the average treatment effect in a small region around "c," at the cost of an assumption that treatment assignment,…
Descriptors: Regression (Statistics), Computation, Statistical Inference, Robustness (Statistics)
Peer reviewed
Casabianca, Jodi M.; Lewis, Charles – Journal of Educational and Behavioral Statistics, 2018
The null hypothesis test used in differential item functioning (DIF) detection tests for a subgroup difference in item-level performance--if the null hypothesis of "no DIF" is rejected, the item is flagged for DIF. Conversely, an item is kept in the test form if there is insufficient evidence of DIF. We present frequentist and empirical…
Descriptors: Test Bias, Hypothesis Testing, Bayesian Statistics, Statistical Analysis
Peer reviewed
Grund, Simon; Lüdtke, Oliver; Robitzsch, Alexander – Journal of Educational and Behavioral Statistics, 2018
Multiple imputation (MI) can be used to address missing data at Level 2 in multilevel research. In this article, we compare joint modeling (JM) and the fully conditional specification (FCS) of MI as well as different strategies for including auxiliary variables at Level 1 using either their manifest or their latent cluster means. We show with…
Descriptors: Statistical Analysis, Data, Comparative Analysis, Hierarchical Linear Modeling
Peer reviewed
Leckie, George – Journal of Educational and Behavioral Statistics, 2018
The traditional approach to estimating the consistency of school effects across subject areas and the stability of school effects across time is to fit separate value-added multilevel models to each subject or cohort and to correlate the resulting empirical Bayes predictions. We show that this gives biased correlations and these biases cannot be…
Descriptors: Value Added Models, Reliability, Statistical Bias, Computation
Peer reviewed
Romero, Mauricio; Riascos, Álvaro; Jara, Diego – Journal of Educational and Behavioral Statistics, 2015
Multiple-choice exams are frequently used as an efficient and objective method to assess learning, but they are more vulnerable to answer copying than tests based on open questions. Several statistical tests (known as indices in the literature) have been proposed to detect cheating; however, to the best of our knowledge, they all lack mathematical…
Descriptors: Cheating, Multiple Choice Tests, Statistical Analysis, Models
Peer reviewed
Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…
Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis
Peer reviewed
Bolsinova, Maria; Tijmstra, Jesper – Journal of Educational and Behavioral Statistics, 2016
Conditional independence (CI) between response time and response accuracy is a fundamental assumption of many joint models for time and accuracy used in educational measurement. In this study, posterior predictive checks (PPCs) are proposed for testing this assumption. These PPCs are based on three discrepancy measures reflecting different…
Descriptors: Reaction Time, Accuracy, Statistical Analysis, Robustness (Statistics)
Peer reviewed
Drechsler, Jörg – Journal of Educational and Behavioral Statistics, 2015
Multiple imputation is widely accepted as the method of choice to address item-nonresponse in surveys. However, research on imputation strategies for the hierarchical structures that are typically found in the data in educational contexts is still limited. While a multilevel imputation model should be preferred from a theoretical point of view if…
Descriptors: Hierarchical Linear Modeling, Statistical Analysis, Educational Research, Statistical Bias
Peer reviewed
Leckie, George; Pillinger, Rebecca; Jones, Kelvyn; Goldstein, Harvey – Journal of Educational and Behavioral Statistics, 2012
The traditional approach to measuring segregation is based upon descriptive, non-model-based indices. A recently proposed alternative is multilevel modeling. The authors further develop the argument for a multilevel modeling approach by first describing and expanding upon its notable advantages, which include an ability to model segregation at a…
Descriptors: Statistical Analysis, Models, Simulation, Measurement Techniques
Peer reviewed
Magis, David; Raiche, Gilles; Beland, Sebastien – Journal of Educational and Behavioral Statistics, 2012
This paper focuses on two likelihood-based indices of person fit, the index "l[subscript z]" and the Snijders's modified index "l[subscript z]*". The first one is commonly used in practical assessment of person fit, although its asymptotic standard normal distribution is not valid when true abilities are replaced by sample…
Descriptors: Goodness of Fit, Item Response Theory, Computation, Ability
Peer reviewed
Strobl, Carolin; Wickelmaier, Florian; Zeileis, Achim – Journal of Educational and Behavioral Statistics, 2011
The preference scaling of a group of subjects may not be homogeneous, but different groups of subjects with certain characteristics may show different preference scalings, each of which can be derived from paired comparisons by means of the Bradley-Terry model. Usually, either different models are fit in predefined subsets of the sample or the…
Descriptors: Individual Differences, Scaling, Statistical Analysis, Models