Publication Date
| Date Range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 3 |
| Since 2022 (last 5 years) | 12 |
| Since 2017 (last 10 years) | 27 |
| Since 2007 (last 20 years) | 50 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Error of Measurement | 78 |
| Test Length | 78 |
| Test Items | 41 |
| Item Response Theory | 36 |
| Sample Size | 30 |
| Test Reliability | 20 |
| Models | 18 |
| Comparative Analysis | 17 |
| Simulation | 17 |
| Scores | 16 |
| Monte Carlo Methods | 15 |
Author
| Author | Count |
| --- | --- |
| Sijtsma, Klaas | 3 |
| Wang, Wen-Chung | 3 |
| DeMars, Christine E. | 2 |
| Emons, Wilco H. M. | 2 |
| Finch, Holmes | 2 |
| Gu, Lixiong | 2 |
| Kilic, Abdullah Faruk | 2 |
| Lee, Won-Chan | 2 |
| Lee, Yi-Hsuan | 2 |
| Livingston, Samuel A. | 2 |
| Stark, Stephen | 2 |
Publication Type
| Publication Type | Count |
| --- | --- |
| Journal Articles | 58 |
| Reports - Research | 53 |
| Reports - Evaluative | 16 |
| Dissertations/Theses -… | 4 |
| Speeches/Meeting Papers | 4 |
| Reports - Descriptive | 2 |
Education Level
| Education Level | Count |
| --- | --- |
| Grade 3 | 2 |
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Secondary Education | 2 |
| Early Childhood Education | 1 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| High Schools | 1 |
| Primary Education | 1 |
Audience
| Audience | Count |
| --- | --- |
| Researchers | 1 |
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) model context has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
Cui, Zhongmin; Kolen, Michael J. – Applied Psychological Measurement, 2008
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
Descriptors: Test Length, Test Content, Simulation, Computation
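Cui and Kolen compare parametric and nonparametric bootstrap standard errors. As a generic illustration of the nonparametric variant (not their equating-specific procedure), one resamples the observed scores with replacement, recomputes the statistic of interest each time, and takes the standard deviation across replications:

```python
import random
import statistics

def bootstrap_se(scores, stat, reps=1000, seed=0):
    """Nonparametric bootstrap standard error of `stat`.

    Illustrative sketch only; the statistic in an equating study would be
    an equated score at a given score point, not a simple mean.
    """
    rng = random.Random(seed)
    n = len(scores)
    replicates = [
        # resample the data with replacement, then recompute the statistic
        stat([scores[rng.randrange(n)] for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(replicates)

# Example: bootstrap SE of the mean for a small score sample
scores = [12, 15, 9, 20, 17, 14, 11, 18, 16, 13]
se = bootstrap_se(scores, statistics.mean)
```

The parametric version differs only in the resampling step: instead of drawing from the observed scores, each replication draws a fresh sample from a fitted distributional model.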
Lee, Yi-Hsuan; Zhang, Jinming – ETS Research Report Series, 2008
The method of maximum-likelihood is typically applied to item response theory (IRT) models when the ability parameter is estimated while conditioning on the true item parameters. In practice, the item parameters are unknown and need to be estimated first from a calibration sample. Lewis (1985) and Zhang and Lu (2007) proposed the expected response…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Ability
Cureton, Edward E.; And Others – Educational and Psychological Measurement, 1973 (peer reviewed)
Study based on F. M. Lord's arguments in 1957 and 1959 that tests of the same length do have the same standard error of measurement. (CB)
Descriptors: Error of Measurement, Statistical Analysis, Test Interpretation, Test Length
Allison, Paul A. – Psychometrika, 1976 (peer reviewed)
A direct proof is given for the generalized Spearman-Brown formula for any real multiple of test length. (Author)
Descriptors: Correlation, Error of Measurement, Raw Scores, Test Length
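Allison's result concerns the generalized Spearman-Brown formula, rho(k) = k*rho / (1 + (k - 1)*rho), which projects reliability when test length is scaled by any positive real multiple k. A minimal sketch (the function name is my own):

```python
def spearman_brown(rho, k):
    """Projected reliability when test length is multiplied by k.

    rho: reliability of the original test (0 < rho < 1)
    k: length multiplier; per Allison (1976) the formula holds for
       any positive real k, not just integer multiples.
    """
    return (k * rho) / (1 + (k - 1) * rho)
```

For example, doubling a test with reliability .60 projects a reliability of .75, while halving it (k = 0.5) projects roughly .43.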
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
DeMars, Christine E. – Educational and Psychological Measurement, 2005
Type I error rates for PARSCALE's fit statistic were examined. Data were generated to fit the partial credit or graded response model, with test lengths of 10 or 20 items. The ability distribution was simulated to be either normal or uniform. Type I error rates were inflated for the shorter test length and, for the graded-response model, also for…
Descriptors: Test Length, Item Response Theory, Psychometrics, Error of Measurement
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Livingston, Samuel A.; Lewis, Charles – 1993
This paper presents a method for estimating the accuracy and consistency of classifications based on test scores. The scores can be produced by any scoring method, including the formation of a weighted composite. The estimates use data from a single form. The reliability of the score is used to estimate its effective test length in terms of…
Descriptors: Classification, Error of Measurement, Estimation (Mathematics), Reliability
Wingersky, Marilyn S. – 1989
In a variable-length adaptive test with a stopping rule that relied on the asymptotic standard error of measurement of the examinee's estimated true score, M. S. Stocking (1987) discovered that it was sufficient to know the examinee's true score and the number of items administered to predict with some accuracy whether an examinee's true score was…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
Livingston, Samuel A. – 1981
The standard error of measurement (SEM) is a measure of the inconsistency in the scores of a particular group of test-takers. It is largest for test-takers whose scores fall near 50 percent correct and smaller for those with nearly perfect scores. On tests used to make pass/fail decisions, the test-takers' scores tend to cluster in the range…
Descriptors: Error of Measurement, Estimation (Mathematics), Mathematical Formulas, Pass Fail Grading
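Livingston's observation that the SEM peaks near 50 percent correct follows directly from the binomial error model, under which the SEM of a number-correct score is sqrt(n * p * (1 - p)). This is the generic binomial model, offered here as a sketch rather than the specific estimator in the paper:

```python
import math

def binomial_sem(p, n):
    """SEM of a number-correct score under the binomial error model.

    p: proportion of items answered correctly
    n: number of items
    The product p * (1 - p) is maximized at p = 0.5, so the SEM is
    largest there and shrinks as p approaches 0 or 1.
    """
    return math.sqrt(n * p * (1 - p))
```

On a 40-item test, for instance, this gives an SEM of about 3.16 raw-score points at 50 percent correct versus about 1.90 at 90 percent correct.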
Axelrod, Bradley N.; And Others – Psychological Assessment, 1996 (peer reviewed)
The calculations of D. Schretlen, R. H. B. Benedict, and J. H. Bobholz (1994) for the reliabilities of a short form of the Wechsler Adult Intelligence Scale-Revised (WAIS-R) consistently overestimated the values. More accurate values are provided for the WAIS-R and a seven-subtest short form. (SLD)
Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests
Stark, Stephen; Drasgow, Fritz – Applied Psychological Measurement, 2002 (peer reviewed)
Describes item response and information functions for the Zinnes and Griggs (1974) paired comparison item response theory (IRT) model and presents procedures for estimating stimulus and person parameters. Monte Carlo simulations show that at least 400 ratings are required to obtain reasonably accurate estimates of the stimulus parameters and their…
Descriptors: Comparative Analysis, Computer Simulation, Error of Measurement, Item Response Theory
Peer reviewedWoodruff, David – Journal of Educational Measurement, 1991
Improvements are made on previous estimates for the conditional standard error of measurement in prediction, the conditional standard error of estimation (CSEE), and the conditional standard error of prediction (CSEP). Better estimates of how test length affects CSEE and CSEP are derived. (SLD)
Descriptors: Equations (Mathematics), Error of Measurement, Estimation (Mathematics), Mathematical Models
Yi, Qing; Wang, Tianyou; Ban, Jae-Chun – 2000
Error indices (bias, standard error of estimation, and root mean square error) obtained on different scales of measurement under different test termination rules in a computerized adaptive test (CAT) context were examined. Four ability estimation methods were studied: (1) maximum likelihood estimation (MLE); (2) weighted likelihood estimation…
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Error of Measurement
