Showing 1 to 15 of 2,789 results
Peer reviewed
Okan Bulut; Doyoung Kim – Journal of Applied Testing Technology, 2023
The development of a Computerized Adaptive Test (CAT) for operational use begins with several important steps, such as creating a large-size item bank, piloting the items on a sizable and representative sample of examinees, dimensionality assessment of the item bank, and estimation of item parameters. Among these steps, testing the dimensionality…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Statistical Analysis
Peer reviewed
Li, Dongmei – Journal of Educational Measurement, 2022
Equating error is usually small relative to the magnitude of measurement error, but it could be one of the major sources of error contributing to mean scores of large groups in educational measurement, such as the year-to-year state mean score fluctuations. Though testing programs may routinely calculate the standard error of equating (SEE), the…
Descriptors: Error Patterns, Educational Testing, Group Testing, Statistical Analysis
Peer reviewed
V. N. Vimal Rao; Jeffrey K. Bye; Sashank Varma – Cognitive Research: Principles and Implications, 2024
The 0.05 boundary within Null Hypothesis Statistical Testing (NHST) "has made a lot of people very angry and been widely regarded as a bad move" (to quote Douglas Adams). Here, we move past meta-scientific arguments and ask an empirical question: What is the psychological standing of the 0.05 boundary for statistical significance? We…
Descriptors: Psychological Patterns, Statistical Analysis, Testing, Statistical Significance
Peer reviewed
Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023
In this study, Kernel test equating methods were compared under NEAT and NEC designs. In NEAT design, Kernel post-stratification and chain equating methods taking into account optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use was considered as a covariate, and Kernel test equating methods were…
Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis
Peer reviewed
A. R. Georgeson – Structural Equation Modeling: A Multidisciplinary Journal, 2025
There is increasing interest in using factor scores in structural equation models and there have been numerous methodological papers on the topic. Nevertheless, sum scores, which are computed from adding up item responses, continue to be ubiquitous in practice. It is therefore important to compare simulation results involving factor scores to…
Descriptors: Structural Equation Models, Scores, Factor Analysis, Statistical Bias
Chenchen Ma; Gongjun Xu – Grantee Submission, 2022
Cognitive Diagnosis Models (CDMs) are a special family of discrete latent variable models widely used in educational, psychological and social sciences. In many applications of CDMs, certain hierarchical structures among the latent attributes are assumed by researchers to characterize their dependence structure. Specifically, a directed acyclic…
Descriptors: Vertical Organization, Models, Evaluation, Statistical Analysis
Peer reviewed
Levin, Joel R.; Ferron, John M.; Gafurov, Boris S. – Educational Psychology Review, 2021
Previous simulation studies of randomization tests applied in single-case educational intervention research contexts have typically focused on A-to-B phase changes in means/levels. In the present simulation study, we report the results of two multiple-baseline investigations, one targeting between-phase changes in slopes/trends and the other…
Descriptors: Educational Research, Statistical Analysis, Hypothesis Testing, Intervention
Peer reviewed
Annabel L. Davies; A. E. Ades; Julian P. T. Higgins – Research Synthesis Methods, 2024
Quantitative evidence synthesis methods aim to combine data from multiple medical trials to infer relative effects of different interventions. A challenge arises when trials report continuous outcomes on different measurement scales. To include all evidence in one coherent analysis, we require methods to "map" the outcomes onto a single…
Descriptors: Children, Body Composition, Measurement Techniques, Sampling
Peer reviewed
Sun-Joo Cho; Amanda Goodwin; Jorge Salas; Sophia Mueller – Grantee Submission, 2025
This study incorporates a random forest (RF) approach to probe complex interactions and nonlinearity among predictors into an item response model with the goal of using a hybrid approach to outperform either an RF or explanatory item response model (EIRM) only in explaining item responses. In the specified model, called EIRM-RF, predicted values…
Descriptors: Item Response Theory, Artificial Intelligence, Statistical Analysis, Predictor Variables
Peer reviewed
Finch, W. Holmes – Journal of Experimental Education, 2022
Multivariate analysis of variance (MANOVA) is widely used to test the null hypothesis of equal multivariate means across 2 or more groups. MANOVA rests upon an assumption that error terms are independent of one another, which can be violated if individuals are clustered or nested within groups, such as schools. Ignoring such nesting can result in…
Descriptors: Multivariate Analysis, Hypothesis Testing, Structural Equation Models, Hierarchical Linear Modeling
Peer reviewed
Guastadisegni, Lucia; Cagnone, Silvia; Moustaki, Irini; Vasdekis, Vassilis – Educational and Psychological Measurement, 2022
This article studies the Type I error, false positive rates, and power of four versions of the Lagrange multiplier test to detect measurement noninvariance in item response theory (IRT) models for binary data under model misspecification. The tests considered are the Lagrange multiplier test computed with the Hessian and cross-product approach,…
Descriptors: Measurement, Statistical Analysis, Item Response Theory, Test Items
Peng, Luyao; Sinharay, Sandip – Educational and Psychological Measurement, 2022
Wollack et al. (2015) suggested the erasure detection index (EDI) for detecting fraudulent erasures for individual examinees. Wollack and Eckerly (2017) and Sinharay (2018) extended the index of Wollack et al. (2015) to suggest three EDIs for detecting fraudulent erasures at the aggregate or group level. This article follows up on the research of…
Descriptors: Cheating, Identification, Statistical Analysis, Testing
Peer reviewed
Rebeckah K. Fussell; Emily M. Stump; N. G. Holmes – Physical Review Physics Education Research, 2024
Physics education researchers are interested in using the tools of machine learning and natural language processing to make quantitative claims from natural language and text data, such as open-ended responses to survey questions. The aspiration is that this form of machine coding may be more efficient and consistent than human coding, allowing…
Descriptors: Physics, Educational Researchers, Artificial Intelligence, Natural Language Processing
Peer reviewed
Vembye, Mikkel Helding; Pustejovsky, James Eric; Pigott, Therese Deocampo – Journal of Educational and Behavioral Statistics, 2023
Meta-analytic models for dependent effect sizes have grown increasingly sophisticated over the last few decades, which has created challenges for a priori power calculations. We introduce power approximations for tests of average effect sizes based upon several common approaches for handling dependent effect sizes. In a Monte Carlo simulation, we…
Descriptors: Meta Analysis, Robustness (Statistics), Statistical Analysis, Models
Peer reviewed
Ranger, Jochen; Brauer, Kay – Journal of Educational and Behavioral Statistics, 2022
The generalized S-X² test is a test of item fit for items with a polytomous response format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S-X² test…
Descriptors: Goodness of Fit, Test Items, Statistical Analysis, Item Response Theory