Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 0 |
| Since 2007 (last 20 years) | 15 |
Descriptor
| Comparative Analysis | 15 |
| Computation | 15 |
| Item Response Theory | 10 |
| Simulation | 9 |
| Measurement | 4 |
| Models | 4 |
| Test Items | 4 |
| Error of Measurement | 3 |
| Monte Carlo Methods | 3 |
| Nonparametric Statistics | 3 |
| Scores | 3 |
| More ▼ | |
Source
| Applied Psychological… | 15 |
Author
Publication Type
| Journal Articles | 15 |
| Reports - Research | 9 |
| Reports - Evaluative | 5 |
| Reports - Descriptive | 1 |
Education Level
| Higher Education | 2 |
| Early Childhood Education | 1 |
| Elementary Education | 1 |
| Grade 2 | 1 |
| High Schools | 1 |
| Postsecondary Education | 1 |
| Primary Education | 1 |
| Secondary Education | 1 |
Audience
Location
| Taiwan | 1 |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Culpepper, Steven Andrew – Applied Psychological Measurement, 2013
A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement
Padilla, Miguel A.; Divers, Jasmin; Newton, Matthew – Applied Psychological Measurement, 2012
Three different bootstrap methods for estimating confidence intervals (CIs) for coefficient alpha were investigated. In addition, the bootstrap methods were compared with the most promising coefficient alpha CI estimation methods reported in the literature. The CI methods were assessed through a Monte Carlo simulation utilizing conditions…
Descriptors: Intervals, Monte Carlo Methods, Computation, Sampling
Nandakumar, Ratna; Yu, Feng; Zhang, Yanwei – Applied Psychological Measurement, 2011
DETECT is a nonparametric methodology to identify the dimensional structure underlying test data. The associated DETECT index, "D[subscript max]," denotes the degree of multidimensionality in data. Conditional covariances (CCOV) are the building blocks of this index. In specifying population CCOVs, the latent test composite [theta][subscript TT]…
Descriptors: Nonparametric Statistics, Statistical Analysis, Tests, Data
Almehrizi, Rashid S. – Applied Psychological Measurement, 2013
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (a; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Descriptors: Raw Scores, Scaling, Reliability, Computation
Yen, Yung-Chin; Ho, Rong-Guey; Laio, Wen-Wei; Chen, Li-Ju; Kuo, Ching-Chin – Applied Psychological Measurement, 2012
In a selected response test, aberrant responses such as careless errors and lucky guesses might cause error in ability estimation because these responses do not actually reflect the knowledge that examinees possess. In a computerized adaptive test (CAT), these aberrant responses could further cause serious estimation error due to dynamic item…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Response Style (Tests)
Kieftenbeld, Vincent; Natesan, Prathiba – Applied Psychological Measurement, 2012
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Descriptors: Test Length, Markov Processes, Item Response Theory, Monte Carlo Methods
Finkelman, Matthew D.; Kim, Wonsuk; Roussos, Louis; Verschoor, Angela – Applied Psychological Measurement, 2010
Automated test assembly (ATA) has been an area of prolific psychometric research. Although ATA methodology is well developed for unidimensional models, its application alongside cognitive diagnosis models (CDMs) is a burgeoning topic. Two suggested procedures for combining ATA and CDMs are to maximize the cognitive diagnostic index and to use a…
Descriptors: Automation, Test Construction, Programming, Models
Kim, Seonghoon – Applied Psychological Measurement, 2010
The three types (generalized, unweighted, and weighted) of least squares methods, proposed by Ogasawara, for estimating item response theory (IRT) linking coefficients under dichotomous models are extended to the graded response model. A simulation study was conducted to confirm the accuracy of the extended formulas, and a real data study was…
Descriptors: Least Squares Statistics, Computation, Item Response Theory, Models
Wang, Tianyou; Brennan, Robert L. – Applied Psychological Measurement, 2009
Frequency estimation, also called poststratification, is an equating method used under the common-item nonequivalent groups design. A modified frequency estimation method is proposed here, based on altering one of the traditional assumptions in frequency estimation in order to correct for equating bias. A simulation study was carried out to…
Descriptors: Computation, Bias, Comparative Analysis, Statistical Analysis
Belov, Dmitry I. – Applied Psychological Measurement, 2011
This article presents the Variable Match Index (VM-Index), a new statistic for detecting answer copying. The power of the VM-Index relies on two-dimensional conditioning as well as the structure of the test. The asymptotic distribution of the VM-Index is analyzed by reduction to Poisson trials. A computational study comparing the VM-Index with the…
Descriptors: Cheating, Journal Articles, Computation, Comparative Analysis
Woods, Carol M.; Lin, Nan – Applied Psychological Measurement, 2009
Davidian-curve item response theory (DC-IRT) is introduced, evaluated with simulations, and illustrated using data from the Schedule for Nonadaptive and Adaptive Personality Entitlement scale. DC-IRT is a method for fitting unidimensional IRT models with maximum marginal likelihood estimation, in which the latent density is estimated,…
Descriptors: Item Response Theory, Personality Measures, Computation, Simulation
Woods, Carol M. – Applied Psychological Measurement, 2008
Differential item functioning (DIF) occurs when an item has different measurement properties for members of one group versus another. Likelihood-ratio (LR) tests for DIF based on item response theory (IRT) involve statistically comparing IRT models that vary with respect to their constraints. A simulation study evaluated how violation of the…
Descriptors: Simulation, Item Response Theory, Comparative Analysis, Statistics
Penfield, Randall D. – Applied Psychological Measurement, 2008
The examination of measurement invariance in polytomous items is complicated by the possibility that the magnitude and sign of lack of invariance may vary across the steps underlying the set of polytomous response options, a concept referred to as differential step functioning (DSF). This article describes three classes of nonparametric DSF effect…
Descriptors: Simulation, Nonparametric Statistics, Item Response Theory, Computation
Abad, Francisco J.; Olea, Julio; Ponsoda, Vicente – Applied Psychological Measurement, 2009
This article deals with some of the problems that have hindered the application of Samejima's and Thissen and Steinberg's multiple-choice models: (a) parameter estimation difficulties owing to the large number of parameters involved, (b) parameter identifiability problems in the Thissen and Steinberg model, and (c) their treatment of omitted…
Descriptors: Multiple Choice Tests, Models, Computation, Simulation
Cui, Zhongmin; Kolen, Michael J. – Applied Psychological Measurement, 2008
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
Descriptors: Test Length, Test Content, Simulation, Computation

Peer reviewed
Direct link
