Publication Date
In 2025 | 13 |
Since 2024 | 65 |
Since 2021 (last 5 years) | 209 |
Since 2016 (last 10 years) | 487 |
Since 2006 (last 20 years) | 1041 |
Author
Kromrey, Jeffrey D. | 21 |
Fan, Xitao | 18 |
Barcikowski, Robert S. | 16 |
DeSarbo, Wayne S. | 14 |
Donoghue, John R. | 12 |
Ferron, John M. | 12 |
Finch, W. Holmes | 12 |
Zhang, Zhiyong | 11 |
Cohen, Allan S. | 10 |
Finch, Holmes | 10 |
Kim, Seock-Ho | 10 |
Audience
Researchers | 49 |
Practitioners | 22 |
Teachers | 19 |
Students | 4 |
Administrators | 2 |
Location
Germany | 10 |
Australia | 7 |
United Kingdom | 7 |
Canada | 6 |
Netherlands | 6 |
United States | 6 |
Belgium | 5 |
California | 5 |
Hong Kong | 5 |
South Korea | 5 |
Spain | 5 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 4 |
Pell Grant Program | 2 |
Aid to Families with… | 1 |
American Recovery and… | 1 |
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 1 |
Meets WWC Standards with or without Reservations | 1 |
Does not meet standards | 1 |
Tong, Xin; Zhang, Zhiyong – Grantee Submission, 2017
Growth curve models are widely used for investigating growth and change phenomena. Many studies in social and behavioral sciences have demonstrated that data without any outlying observation are rather an exception, especially for data collected longitudinally. Ignoring the existence of outlying observations may lead to inaccurate or even…
Descriptors: Observation, Models, Statistical Distributions, Monte Carlo Methods
Weiss, Brandi A.; Dardick, William – Educational and Psychological Measurement, 2016
This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…
Descriptors: Regression (Statistics), Goodness of Fit, Models, Classification
Frermann, Lea; Lapata, Mirella – Cognitive Science, 2016
Models of category learning have been extensively studied in cognitive science and primarily tested on perceptual abstractions or artificial stimuli. In this paper, we focus on categories acquired from natural language stimuli, that is, words (e.g., "chair" is a member of the furniture category). We present a Bayesian model that, unlike…
Descriptors: Classification, Bayesian Statistics, Models, Cognitive Science
Man, Kaiwen; Harring, Jeffrey R. – Educational and Psychological Measurement, 2019
With the development of technology-enhanced learning platforms, eye-tracking biometric indicators can be recorded simultaneously with students' item responses. In the current study, visual fixation, an essential eye-tracking indicator, is modeled to reflect the degree of test engagement when a test taker solves a set of test questions. Three…
Descriptors: Test Items, Eye Movements, Models, Regression (Statistics)
Bolin, Jocelyn H.; Finch, W. Holmes; Stenger, Rachel – Educational and Psychological Measurement, 2019
Multilevel data are a reality for many disciplines. Currently, although multiple options exist for the treatment of multilevel data, most disciplines strictly adhere to one method for multilevel data regardless of the specific research design circumstances. The purpose of this Monte Carlo simulation study is to compare several methods for the…
Descriptors: Hierarchical Linear Modeling, Computation, Statistical Analysis, Maximum Likelihood Statistics
Yasuda, Jun-ichiro; Mae, Naohiro; Hull, Michael M.; Taniguchi, Masa-aki – Physical Review Physics Education Research, 2021
As a method to shorten the test time of the Force Concept Inventory (FCI), we suggest the use of computerized adaptive testing (CAT). CAT is the process of administering a test on a computer, with items (i.e., questions) selected based upon the responses of the examinee to prior items. In so doing, the test length can be significantly shortened.…
Descriptors: Foreign Countries, College Students, Student Evaluation, Computer Assisted Testing
Beaujean, A. Alexander – Journal of Psychoeducational Assessment, 2018
Simulation studies use computer-generated data to examine questions of interest that have traditionally been used to study properties of statistics and estimating algorithms. With the recent advent of powerful processing capabilities in affordable computers along with readily usable software, it is now feasible to use a simulation study to aid in…
Descriptors: Computer Simulation, Computation, Learning Disabilities, Identification
Ames, Allison J.; Au, Chi Hang – Measurement: Interdisciplinary Research and Perspectives, 2018
Stan is a flexible probabilistic programming language providing full Bayesian inference through Hamiltonian Monte Carlo algorithms. The benefits of Hamiltonian Monte Carlo include improved efficiency and faster inference, when compared to other MCMC software implementations. Users can interface with Stan through a variety of computing…
Descriptors: Item Response Theory, Computer Software Evaluation, Computer Software, Programming Languages
McNeish, Daniel – Educational and Psychological Measurement, 2017
In behavioral sciences broadly, estimating growth models with Bayesian methods is becoming increasingly common, especially to combat small samples common with longitudinal data. Although Mplus is becoming an increasingly common program for applied research employing Bayesian methods, the limited selection of prior distributions for the elements of…
Descriptors: Models, Bayesian Statistics, Statistical Analysis, Computer Software
Qian, Jiahe – ETS Research Report Series, 2017
The variance formula derived for a two-stage sampling design without replacement employs the joint inclusion probabilities in the first-stage selection of clusters. One of the difficulties encountered in data analysis is the lack of information about such joint inclusion probabilities. One way to solve this issue is by applying Hájek's…
Descriptors: Mathematical Formulas, Computation, Sampling, Research Design
Pedersen, Ellen Raben; Juhl, Peter Møller – Journal of Speech, Language, and Hearing Research, 2017
Purpose: Critical differences state by how much 2 test results have to differ in order to be significantly different. Critical differences for discrimination scores have been available for several decades, but they do not exist for speech reception thresholds (SRTs). This study presents and discusses how critical differences for SRTs can be…
Descriptors: Speech, Simulation, Differences, Test Results
Dai, Shenghai; Svetina, Dubravka; Wang, Xiaolin – Journal of Educational and Behavioral Statistics, 2017
There is an increasing interest in reporting test subscores for diagnostic purposes. In this article, we review nine popular R packages (subscore, mirt, TAM, sirt, CDM, NPCD, lavaan, sem, and OpenMx) that are capable of implementing subscore-reporting methods within one or more frameworks including classical test theory, multidimensional item…
Descriptors: Diagnostic Tests, Scores, Computer Software, Item Response Theory
Reardon, Sean F.; Shear, Benjamin R.; Castellano, Katherine E.; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2017
Test score distributions of schools or demographic groups are often summarized by frequencies of students scoring in a small number of ordered proficiency categories. We show that heteroskedastic ordered probit (HETOP) models can be used to estimate means and standard deviations of multiple groups' test score distributions from such data. Because…
Descriptors: Scores, Statistical Analysis, Models, Computation
Baghaei, Samira; Bagheri, Mohammad Sadegh; Yamini, Mortaza – Cogent Education, 2020
The main purpose of this quantitative-qualitative content analysis study was to compare IELTS and TOEFL listening and reading tests based on the representation of the learning objectives of Revised Bloom's taxonomy. To this end, 12 Academic IELTS listening and reading tests and 12 TOEFL iBT listening and reading tests were analyzed qualitatively…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Reading Tests
Andersson, Björn; Xin, Tao – Educational and Psychological Measurement, 2018
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
Descriptors: Item Response Theory, Test Reliability, Test Items, Scores