Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 4 |
| Since 2017 (last 10 years) | 20 |
| Since 2007 (last 20 years) | 56 |
Descriptor
| Computation | 62 |
| Test Length | 62 |
| Item Response Theory | 40 |
| Test Items | 29 |
| Sample Size | 26 |
| Accuracy | 20 |
| Simulation | 19 |
| Maximum Likelihood Statistics | 15 |
| Bayesian Statistics | 14 |
| Error of Measurement | 14 |
| Correlation | 12 |
| More ▼ | |
Source
Author
| Wang, Wen-Chung | 4 |
| Cheng, Ying | 3 |
| He, Wei | 2 |
| Kilic, Abdullah Faruk | 2 |
| Lathrop, Quinn N. | 2 |
| Lee, Yi-Hsuan | 2 |
| Liu, Chen-Wei | 2 |
| Paek, Insu | 2 |
| Zhang, Jinming | 2 |
| de la Torre, Jimmy | 2 |
| Atar, Burcu | 1 |
| More ▼ | |
Publication Type
| Journal Articles | 52 |
| Reports - Research | 39 |
| Reports - Evaluative | 14 |
| Dissertations/Theses -… | 8 |
| Reports - Descriptive | 1 |
Education Level
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Secondary Education | 2 |
| Early Childhood Education | 1 |
| Elementary Secondary Education | 1 |
| High Schools | 1 |
| Preschool Education | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
| National Assessment of… | 1 |
| National Longitudinal Study… | 1 |
| Program for International… | 1 |
| Trends in International… | 1 |
What Works Clearinghouse Rating
Doebler, Anna; Doebler, Philipp; Holling, Heinz – Psychometrika, 2013
The common way to calculate confidence intervals for item response theory models is to assume that the standardized maximum likelihood estimator for the person parameter [theta] is normally distributed. However, this approximation is often inadequate for short and medium test lengths. As a result, the coverage probabilities fall below the given…
Descriptors: Foreign Countries, Item Response Theory, Computation, Hypothesis Testing
He, Wei; Reckase, Mark D. – Educational and Psychological Measurement, 2014
For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…
Descriptors: Item Banks, Test Length, Computer Assisted Testing, Adaptive Testing
Walker, Cindy M.; Zhang, Bo; Banks, Kathleen; Cappaert, Kevin – Educational and Psychological Measurement, 2012
The purpose of this simulation study was to establish general effect size guidelines for interpreting the results of differential bundle functioning (DBF) analyses using simultaneous item bias test (SIBTEST). Three factors were manipulated: number of items in a bundle, test length, and magnitude of uniform differential item functioning (DIF)…
Descriptors: Test Bias, Test Length, Simulation, Guidelines
Magis, David; Beland, Sebastien; Raiche, Gilles – Applied Psychological Measurement, 2011
In this study, the estimation of extremely large or extremely small proficiency levels, given the item parameters of a logistic item response model, is investigated. On one hand, the estimation of proficiency levels by maximum likelihood (ML), despite being asymptotically unbiased, may yield infinite estimates. On the other hand, with an…
Descriptors: Test Length, Computation, Item Response Theory, Maximum Likelihood Statistics
Md Desa, Zairul Nor Deana – ProQuest LLC, 2012
In recent years, there has been increasing interest in estimating and improving subscore reliability. In this study, the multidimensional item response theory (MIRT) and the bi-factor model were combined to estimate subscores, to obtain subscores reliability, and subscores classification. Both the compensatory and partially compensatory MIRT…
Descriptors: Item Response Theory, Computation, Reliability, Classification
Culpepper, Steven Andrew – Applied Psychological Measurement, 2012
Measurement error significantly biases interaction effects and distorts researchers' inferences regarding interactive hypotheses. This article focuses on the single-indicator case and shows how to accurately estimate group slope differences by disattenuating interaction effects with errors-in-variables (EIV) regression. New analytic findings were…
Descriptors: Evidence, Test Length, Interaction, Regression (Statistics)
He, Wei; Wolfe, Edward W. – Educational and Psychological Measurement, 2012
In administration of individually administered intelligence tests, items are commonly presented in a sequence of increasing difficulty, and test administration is terminated after a predetermined number of incorrect answers. This practice produces stochastically censored data, a form of nonignorable missing data. By manipulating four factors…
Descriptors: Individual Testing, Intelligence Tests, Test Items, Test Length
Wyse, Adam E.; Hao, Shiqi – Applied Psychological Measurement, 2012
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Descriptors: Item Response Theory, Classification, Accuracy, Reliability
Nandakumar, Ratna; Yu, Feng; Zhang, Yanwei – Applied Psychological Measurement, 2011
DETECT is a nonparametric methodology to identify the dimensional structure underlying test data. The associated DETECT index, "D[subscript max]," denotes the degree of multidimensionality in data. Conditional covariances (CCOV) are the building blocks of this index. In specifying population CCOVs, the latent test composite [theta][subscript TT]…
Descriptors: Nonparametric Statistics, Statistical Analysis, Tests, Data
Roberts, James S.; Thompson, Vanessa M. – Applied Psychological Measurement, 2011
A marginal maximum a posteriori (MMAP) procedure was implemented to estimate item parameters in the generalized graded unfolding model (GGUM). Estimates from the MMAP method were compared with those derived from marginal maximum likelihood (MML) and Markov chain Monte Carlo (MCMC) procedures in a recovery simulation that varied sample size,…
Descriptors: Statistical Analysis, Markov Processes, Computation, Monte Carlo Methods
Seo, Minhee; Roussos, Louis A. – Journal of Educational Measurement, 2010
DIMTEST is a widely used and studied method for testing the hypothesis of test unidimensionality as represented by local item independence. However, DIMTEST does not report the amount of multidimensionality that exists in data when rejecting its null. To provide more information regarding the degree to which data depart from unidimensionality, a…
Descriptors: Effect Size, Statistical Bias, Computation, Test Length
Kieftenbeld, Vincent; Natesan, Prathiba – Applied Psychological Measurement, 2012
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Descriptors: Test Length, Markov Processes, Item Response Theory, Monte Carlo Methods
Paek, Insu; Wilson, Mark – Educational and Psychological Measurement, 2011
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
Descriptors: Test Bias, Test Length, Statistical Inference, Geometric Concepts
Wang, Wen-Chung; Liu, Chen-Wei – Educational and Psychological Measurement, 2011
The generalized graded unfolding model (GGUM) has been recently developed to describe item responses to Likert items (agree-disagree) in attitude measurement. In this study, the authors (a) developed two item selection methods in computerized classification testing under the GGUM, the current estimate/ability confidence interval method and the cut…
Descriptors: Computer Assisted Testing, Adaptive Testing, Classification, Item Response Theory
Foley, Brett Patrick – ProQuest LLC, 2010
The 3PL model is a flexible and widely used tool in assessment. However, it suffers from limitations due to its need for large sample sizes. This study introduces and evaluates the efficacy of a new sample size augmentation technique called Duplicate, Erase, and Replace (DupER) Augmentation through a simulation study. Data are augmented using…
Descriptors: Test Length, Sample Size, Simulation, Item Response Theory

Peer reviewed
Direct link
