Publication Date
| Filter | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 12 |
| Since 2017 (last 10 years) | 22 |
| Since 2007 (last 20 years) | 35 |
Descriptor
| Filter | Records |
| --- | --- |
| Monte Carlo Methods | 52 |
| Test Length | 52 |
| Item Response Theory | 37 |
| Test Items | 28 |
| Sample Size | 27 |
| Error of Measurement | 15 |
| Accuracy | 13 |
| Comparative Analysis | 13 |
| Computation | 12 |
| Item Analysis | 11 |
| Models | 11 |
Publication Type
| Filter | Records |
| --- | --- |
| Journal Articles | 40 |
| Reports - Research | 33 |
| Reports - Evaluative | 14 |
| Speeches/Meeting Papers | 5 |
| Dissertations/Theses -… | 4 |
| Numerical/Quantitative Data | 1 |
| Reports - Descriptive | 1 |
Education Level
| Filter | Records |
| --- | --- |
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Elementary Education | 1 |
Audience
| Filter | Records |
| --- | --- |
| Researchers | 1 |
Location
| Filter | Records |
| --- | --- |
| Japan | 1 |
de la Torre, Jimmy; Song, Hao – Applied Psychological Measurement, 2009
Assessments consisting of different domains (e.g., content areas, objectives) are typically multidimensional in nature but are commonly assumed to be unidimensional for estimation purposes. The different domains of these assessments are further treated as multi-unidimensional tests for the purpose of obtaining diagnostic information. However, when…
Descriptors: Ability, Tests, Item Response Theory, Data Analysis
Klockars, Alan J.; Lee, Yoonsun – Journal of Educational Measurement, 2008
Monte Carlo simulations with 20,000 replications are reported to estimate the probability of rejecting the null hypothesis regarding DIF using SIBTEST when there is DIF present and/or when impact is present due to differences on the primary dimension to be measured. Sample sizes are varied from 250 to 2000 and test lengths from 10 to 40 items.…
Descriptors: Test Bias, Test Length, Reference Groups, Probability
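The Monte Carlo design in this abstract, simulate data under a known condition many times, run the hypothesis test each time, and report the rejection proportion, can be sketched generically. A minimal stdlib-only sketch in which a two-group z-test on normal samples stands in for SIBTEST (it is not SIBTEST; the function name and all parameter values are illustrative):

```python
import random
import statistics

def rejection_rate(delta, n=250, reps=1000, z_crit=1.96, seed=1):
    """Monte Carlo estimate of the probability of rejecting H0 (no group
    difference) with a two-sided z-test on unit-variance normal samples.
    `delta` is the true mean difference: under delta=0 the rate estimates
    the Type I error; otherwise it estimates power."""
    rng = random.Random(seed)
    se = (2.0 / n) ** 0.5  # standard error of the mean difference
    rejections = 0
    for _ in range(reps):
        ref = statistics.mean(rng.gauss(0.0, 1.0) for _ in range(n))
        foc = statistics.mean(rng.gauss(delta, 1.0) for _ in range(n))
        if abs(foc - ref) / se > z_crit:
            rejections += 1
    return rejections / reps

print(rejection_rate(0.0))  # hovers near the nominal 0.05 level
print(rejection_rate(0.2))  # estimated power against a true difference
```

Varying `n` and `delta` over a grid, as the study varies sample size and test length, turns this loop into a power surface.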
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) model context has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
Wells, Craig S.; Bolt, Daniel M. – Applied Measurement in Education, 2008
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
Descriptors: Test Length, Test Items, Monte Carlo Methods, Nonparametric Statistics
Monahan, Patrick O.; Stump, Timothy E.; Finch, Holmes; Hambleton, Ronald K. – Applied Psychological Measurement, 2007
DETECT is a nonparametric "full" dimensionality assessment procedure that clusters dichotomously scored items into dimensions and provides a DETECT index of magnitude of multidimensionality. Four factors (test length, sample size, item response theory [IRT] model, and DETECT index) were manipulated in a Monte Carlo study of bias, standard error,…
Descriptors: Test Length, Sample Size, Monte Carlo Methods, Geometric Concepts
Abdel-fattah, Abdel-fattah A. – 1994
The accuracy of estimation procedures in item response theory was studied using Monte Carlo methods and varying sample size, number of subjects, and distribution of ability parameters for: (1) joint maximum likelihood as implemented in the computer program LOGIST; (2) marginal maximum likelihood; and (3) marginal Bayesian procedures as implemented…
Descriptors: Ability, Bayesian Statistics, Estimation (Mathematics), Maximum Likelihood Statistics
Peer reviewed
Cudeck, Robert; And Others – Applied Psychological Measurement, 1979
TAILOR, a computer program which implements an approach to tailored testing, was examined by Monte Carlo methods. The evaluation showed the procedure to be highly reliable and capable of reducing the required number of test items by about one half. (Author/JKS)
Descriptors: Adaptive Testing, Computer Programs, Feasibility Studies, Item Analysis
Peer reviewed
Stark, Stephen; Drasgow, Fritz – Applied Psychological Measurement, 2002
Describes item response and information functions for the Zinnes and Griggs paired comparison item response theory (IRT) model (1974) and presents procedures for estimating stimulus and person parameters. Monte Carlo simulations show that at least 400 ratings are required to obtain reasonably accurate estimates of the stimulus parameters and their…
Descriptors: Comparative Analysis, Computer Simulation, Error of Measurement, Item Response Theory
Peer reviewed
Noonan, Brian W.; And Others – Applied Psychological Measurement, 1992
Studied the extent to which three appropriateness indexes, Z₃, ECIZ4, and W, are well standardized in a Monte Carlo study. The ECIZ4 most closely approximated a normal distribution, and its skewness and kurtosis were more stable and less affected by test length and item response theory model than the others. (SLD)
Descriptors: Comparative Analysis, Item Response Theory, Mathematical Models, Maximum Likelihood Statistics
Peer reviewed
Stone, Clement A. – Applied Psychological Measurement, 1992
Monte Carlo methods are used to evaluate marginal maximum likelihood estimation of item parameters and maximum likelihood estimates of theta in the two-parameter logistic model for varying test lengths, sample sizes, and assumed theta distributions. Results with 100 datasets demonstrate the methods' general precision and stability. Exceptions are…
Descriptors: Computer Software Evaluation, Estimation (Mathematics), Mathematical Models, Maximum Likelihood Statistics
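The two-parameter logistic model named in the Stone abstract has a simple closed form, P(correct) = 1 / (1 + exp(-a(θ - b))). A minimal sketch of simulating responses under it and recovering θ, assuming a grid-search maximum-likelihood estimator in place of the marginal-maximum-likelihood routines the study actually evaluates (all parameter values are made up):

```python
import math
import random

def p2pl(theta, a, b):
    """2PL item response function: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def theta_mle(responses, a_params, b_params):
    """Maximum-likelihood theta by grid search over [-4, 4]
    (an illustrative stand-in for Newton-type estimators)."""
    grid = [g / 100.0 for g in range(-400, 401)]
    def loglik(t):
        ll = 0.0
        for u, a, b in zip(responses, a_params, b_params):
            p = p2pl(t, a, b)
            ll += math.log(p) if u == 1 else math.log(1.0 - p)
        return ll
    return max(grid, key=loglik)

# Simulate a 40-item test for one examinee with known true theta.
random.seed(7)
a = [1.0 + 0.5 * random.random() for _ in range(40)]  # discriminations
b = [random.gauss(0.0, 1.0) for _ in range(40)]       # difficulties
true_theta = 0.5
u = [1 if random.random() < p2pl(true_theta, ai, bi) else 0
     for ai, bi in zip(a, b)]
print(round(theta_mle(u, a, b), 2))  # should land near true_theta
```

Repeating the simulation across many examinees and datasets, as these studies do, gives the bias and variability of the estimates.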
Peer reviewed
Reise, Steven P.; Due, Allan M. – Applied Psychological Measurement, 1991
Previous person-fit research is extended through explication of an unexplored model for generating aberrant response patterns. The proposed model is then implemented to investigate the influence of test properties on the aberrancy detection power of a person-fit statistic. Difficulties of aberrancy detection are discussed. (SLD)
Descriptors: Algorithms, Computer Simulation, Item Response Theory, Mathematical Models
Peer reviewed
Van Der Linden, Wim J. – Educational and Psychological Measurement, 1983
This paper focuses on mixtures of two binomials with one known success parameter. It is shown how moment estimators can be obtained for the remaining unknown parameters of such mixtures, and results are presented from a Monte Carlo study carried out to explore the statistical properties of these estimators. (PN)
Descriptors: Educational Testing, Error of Measurement, Estimation (Mathematics), Guessing (Tests)
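Moment estimators for a two-component binomial mixture with one known success parameter, as in the Van Der Linden abstract, can be obtained in closed form from the first two factorial moments. A sketch under an assumed parameterization π·Bin(n, c) + (1-π)·Bin(n, p) with c known (the symbols and values are illustrative, not taken from the paper):

```python
import random

def moment_fit(xs, n, c):
    """Method-of-moments estimates (pi, p) for the mixture
    pi*Bin(n, c) + (1 - pi)*Bin(n, p) with c known, using
        m1 = E[X] / n              = pi*c   + (1-pi)*p
        m2 = E[X(X-1)] / (n(n-1))  = pi*c^2 + (1-pi)*p^2
    which solve in closed form: 1 - pi = (m1-c)^2 / (m2 - 2*c*m1 + c^2)."""
    m1 = sum(xs) / (len(xs) * n)
    m2 = sum(x * (x - 1) for x in xs) / (len(xs) * n * (n - 1))
    q = (m1 - c) ** 2 / (m2 - 2 * c * m1 + c * c)  # q = 1 - pi
    pi = 1.0 - q
    p = (m1 - pi * c) / q
    return pi, p

def draw(n, prob, rng):
    """One binomial draw as a sum of Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < prob)

# Simulate, then recover the unknown parameters pi and p.
rng = random.Random(3)
n, c, true_pi, true_p = 40, 0.25, 0.3, 0.7
xs = [draw(n, c if rng.random() < true_pi else true_p, rng)
      for _ in range(5000)]
pi_hat, p_hat = moment_fit(xs, n, c)
print(round(pi_hat, 2), round(p_hat, 2))  # near (0.3, 0.7)
```

A Monte Carlo study of these estimators, as in the paper, repeats the simulate-and-fit cycle to trace their sampling distributions.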
de la Torre, Jimmy; Stark, Stephen; Chernyshenko, Oleksandr S. – Applied Psychological Measurement, 2006
The authors present a Markov Chain Monte Carlo (MCMC) parameter estimation procedure for the generalized graded unfolding model (GGUM) and compare it to the marginal maximum likelihood (MML) approach implemented in the GGUM2000 computer program, using simulated and real personality data. In the simulation study, test length, number of response…
Descriptors: Computation, Monte Carlo Methods, Markov Processes, Item Response Theory
Gilmer, Jerry S.; Feldt, Leonard S. – 1982
The Feldt-Gilmer congeneric reliability coefficients make it possible to estimate the reliability of a test composed of parts of unequal, unknown length. The approximate standard errors of the Feldt-Gilmer coefficients are derived via a method using the multivariate Taylor's expansion. Monte Carlo simulation is employed to corroborate the…
Descriptors: Educational Testing, Error of Measurement, Mathematical Formulas, Mathematical Models
Ankenmann, Robert D.; Stone, Clement A. – 1992
Effects of test length, sample size, and assumed ability distribution were investigated in a multiple replication Monte Carlo study under the 1-parameter (1P) and 2-parameter (2P) logistic graded model with five score levels. Accuracy and variability of item parameter and ability estimates were examined. Monte Carlo methods were used to evaluate…
Descriptors: Computer Simulation, Estimation (Mathematics), Item Bias, Mathematical Models
