Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 4 |
| Since 2017 (last 10 years) | 20 |
| Since 2007 (last 20 years) | 56 |
Descriptor
| Computation | 62 |
| Test Length | 62 |
| Item Response Theory | 40 |
| Test Items | 29 |
| Sample Size | 26 |
| Accuracy | 20 |
| Simulation | 19 |
| Maximum Likelihood Statistics | 15 |
| Bayesian Statistics | 14 |
| Error of Measurement | 14 |
| Correlation | 12 |
Author
| Wang, Wen-Chung | 4 |
| Cheng, Ying | 3 |
| He, Wei | 2 |
| Kilic, Abdullah Faruk | 2 |
| Lathrop, Quinn N. | 2 |
| Lee, Yi-Hsuan | 2 |
| Liu, Chen-Wei | 2 |
| Paek, Insu | 2 |
| Zhang, Jinming | 2 |
| de la Torre, Jimmy | 2 |
| Atar, Burcu | 1 |
Publication Type
| Journal Articles | 52 |
| Reports - Research | 39 |
| Reports - Evaluative | 14 |
| Dissertations/Theses -… | 8 |
| Reports - Descriptive | 1 |
Education Level
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Secondary Education | 2 |
| Early Childhood Education | 1 |
| Elementary Secondary Education | 1 |
| High Schools | 1 |
| Preschool Education | 1 |
Assessments and Surveys
| National Assessment of… | 1 |
| National Longitudinal Study… | 1 |
| Program for International… | 1 |
| Trends in International… | 1 |
Fu, Qiong – ProQuest LLC, 2010
This research investigated how the accuracy of person ability and item difficulty parameter estimation varied across five IRT models with respect to the presence of guessing, targeting, and varied combinations of sample sizes and test lengths. The data were simulated with 50 replications under each of the 18 combined conditions. Five IRT models…
Descriptors: Item Response Theory, Guessing (Tests), Accuracy, Computation
Deng, Nina – ProQuest LLC, 2011
Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, LEE method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…
Descriptors: Item Response Theory, Test Theory, Computation, Classification
Oranje, Andreas; Li, Deping; Kandathil, Mathew – ETS Research Report Series, 2009
Several complex sample standard error estimators based on linearization and resampling for the latent regression model of the National Assessment of Educational Progress (NAEP) are studied with respect to design choices such as number of items, number of regressors, and the efficiency of the sample. This paper provides an evaluation of the extent…
Descriptors: Error of Measurement, Computation, Regression (Statistics), National Competency Tests
Livingston, Samuel A.; Lewis, Charles – Educational Testing Service, 2009
This report proposes an empirical Bayes approach to the problem of equating scores on test forms taken by very small numbers of test takers. The equated score is estimated separately at each score point, making it unnecessary to model either the score distribution or the equating transformation. Prior information comes from equatings of other…
Descriptors: Test Length, Equated Scores, Bayesian Statistics, Sample Size
Finkelman, Matthew David – Applied Psychological Measurement, 2010
In sequential mastery testing (SMT), assessment via computer is used to classify examinees into one of two mutually exclusive categories. Unlike paper-and-pencil tests, SMT has the capability to use variable-length stopping rules. One approach to shortening variable-length tests is stochastic curtailment, which halts examination if the probability…
Descriptors: Mastery Tests, Computer Assisted Testing, Adaptive Testing, Test Length
de la Torre, Jimmy; Song, Hao – Applied Psychological Measurement, 2009
Assessments consisting of different domains (e.g., content areas, objectives) are typically multidimensional in nature but are commonly assumed to be unidimensional for estimation purposes. The different domains of these assessments are further treated as multi-unidimensional tests for the purpose of obtaining diagnostic information. However, when…
Descriptors: Ability, Tests, Item Response Theory, Data Analysis
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) model context is one that has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
Cui, Zhongmin; Kolen, Michael J. – Applied Psychological Measurement, 2008
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
Descriptors: Test Length, Test Content, Simulation, Computation
Lee, Yi-Hsuan; Zhang, Jinming – ETS Research Report Series, 2008
The method of maximum-likelihood is typically applied to item response theory (IRT) models when the ability parameter is estimated while conditioning on the true item parameters. In practice, the item parameters are unknown and need to be estimated first from a calibration sample. Lewis (1985) and Zhang and Lu (2007) proposed the expected response…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Ability
Wei, Youhua – ProQuest LLC, 2008
Scale linking is the process of developing the connection between scales of two or more sets of parameter estimates obtained from separate test calibrations. It is the prerequisite for many applications of IRT, such as test equating and differential item functioning analysis. Unidimensional scale linking methods have been studied and applied…
Descriptors: Test Length, Test Items, Sample Size, Simulation
Woods, Carol M. – Applied Psychological Measurement, 2008
In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters of a unidimensional item response model using marginal maximum likelihood estimation. This study evaluates RC-IRT for the three-parameter logistic (3PL) model with comparisons to the normal model and to the empirical…
Descriptors: Test Length, Computation, Item Response Theory, Maximum Likelihood Statistics
Liou, Michelle – Applied Psychological Measurement, 1994 (Peer reviewed)
A recursive equation is proposed for computing higher order derivatives of elementary symmetric functions in the Rasch model. A simulation study indicates a small loss in accuracy for the proposed formula compared to Gustafsson's sum algorithm (1980) for computing higher order derivatives when tests contain 60 items or fewer. (SLD)
Descriptors: Algorithms, Computation, Item Response Theory, Simulation
Hendrawan, Irene; Glas, Cees A. W.; Meijer, Rob R. – Applied Psychological Measurement, 2005
The effect of person misfit to an item response theory model on a mastery/nonmastery decision was investigated. Furthermore, it was investigated whether the classification precision can be improved by identifying misfitting respondents using person-fit statistics. A simulation study was conducted to investigate the probability of a correct…
Descriptors: Probability, Statistics, Test Length, Simulation
Wang, Wen-Chung – Educational and Psychological Measurement, 2004
The Pearson correlation is used to depict effect sizes in the context of item response theory. A multidimensional Rasch model is used to directly estimate the correlation between latent traits. Monte Carlo simulations were conducted to investigate whether the population correlation could be accurately estimated and whether the bootstrap method…
Descriptors: Test Length, Sampling, Effect Size, Correlation
Eggen, Theo J. H. M.; Verelst, Norman D. – Psychometrika, 2006
In this paper, the efficiency of conditional maximum likelihood (CML) and marginal maximum likelihood (MML) estimation of the item parameters of the Rasch model in incomplete designs is investigated. The use of the concept of F-information (Eggen, 2000) is generalized to incomplete testing designs. The scaled determinant of the F-information…
Descriptors: Test Length, Computation, Maximum Likelihood Statistics, Models

