Showing 46 to 60 of 78 results
Peer reviewed
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) model context is one that has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
Peer reviewed
Cui, Zhongmin; Kolen, Michael J. – Applied Psychological Measurement, 2008
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
Descriptors: Test Length, Test Content, Simulation, Computation
Peer reviewed
Lee, Yi-Hsuan; Zhang, Jinming – ETS Research Report Series, 2008
The method of maximum-likelihood is typically applied to item response theory (IRT) models when the ability parameter is estimated while conditioning on the true item parameters. In practice, the item parameters are unknown and need to be estimated first from a calibration sample. Lewis (1985) and Zhang and Lu (2007) proposed the expected response…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Ability
Peer reviewed
Cureton, Edward E.; And Others – Educational and Psychological Measurement, 1973
Study based on F. M. Lord's arguments in 1957 and 1959 that tests of the same length do have the same standard error of measurement. (CB)
Descriptors: Error of Measurement, Statistical Analysis, Test Interpretation, Test Length
Peer reviewed
Allison, Paul A. – Psychometrika, 1976
A direct proof is given for the generalized Spearman-Brown formula for any real multiple of test length. (Author)
Descriptors: Correlation, Error of Measurement, Raw Scores, Test Length
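The generalized Spearman-Brown formula referred to in the entry above predicts the reliability of a test whose length is multiplied by any positive real factor k. A minimal sketch follows, assuming the standard form k·ρ / (1 + (k − 1)·ρ); the function name `spearman_brown` is illustrative and does not come from the cited article.

```python
def spearman_brown(reliability: float, k: float) -> float:
    """Predicted reliability when test length is multiplied by k.

    `reliability` is the reliability of the original test and `k` is any
    positive real multiple of its length (k < 1 shortens the test).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return k * reliability / (1.0 + (k - 1.0) * reliability)


# Doubling and halving a test whose reliability is 0.80.
print(round(spearman_brown(0.80, 2.0), 3))  # 0.889
print(round(spearman_brown(0.80, 0.5), 3))  # 0.667
```

With k = 1 the function returns the original reliability unchanged, which is a quick sanity check on the formula.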
Peer reviewed
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
Peer reviewed
DeMars, Christine E. – Educational and Psychological Measurement, 2005
Type I error rates for PARSCALE's fit statistic were examined. Data were generated to fit the partial credit or graded response model, with test lengths of 10 or 20 items. The ability distribution was simulated to be either normal or uniform. Type I error rates were inflated for the shorter test length and, for the graded-response model, also for…
Descriptors: Test Length, Item Response Theory, Psychometrics, Error of Measurement
Peer reviewed
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Livingston, Samuel A.; Lewis, Charles – 1993
This paper presents a method for estimating the accuracy and consistency of classifications based on test scores. The scores can be produced by any scoring method, including the formation of a weighted composite. The estimates use data from a single form. The reliability of the score is used to estimate its effective test length in terms of…
Descriptors: Classification, Error of Measurement, Estimation (Mathematics), Reliability
Wingersky, Marilyn S. – 1989
In a variable-length adaptive test with a stopping rule that relied on the asymptotic standard error of measurement of the examinee's estimated true score, M. S. Stocking (1987) discovered that it was sufficient to know the examinee's true score and the number of items administered to predict with some accuracy whether an examinee's true score was…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
Livingston, Samuel A. – 1981
The standard error of measurement (SEM) is a measure of the inconsistency in the scores of a particular group of test-takers. It is largest for test-takers with scores near 50 percent correct and smaller for those with nearly perfect scores. On tests used to make pass/fail decisions, the test-takers' scores tend to cluster in the range…
Descriptors: Error of Measurement, Estimation (Mathematics), Mathematical Formulas, Pass Fail Grading
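The pattern described in the entry above, with the SEM peaking near 50 percent correct and shrinking toward perfect scores, can be illustrated with a score-level SEM. The abstract does not say which formula the paper uses, so the sketch below assumes Lord's binomial error model, sqrt(x(n − x)/(n − 1)) for raw score x on n items; the function name `binomial_sem` is hypothetical.

```python
import math


def binomial_sem(raw_score: int, n_items: int) -> float:
    """Score-level SEM under the binomial error model (an assumption here).

    Computes sqrt(x * (n - x) / (n - 1)) for raw score x on an n-item test.
    """
    if n_items < 2:
        raise ValueError("n_items must be at least 2")
    if not 0 <= raw_score <= n_items:
        raise ValueError("raw_score must lie between 0 and n_items")
    return math.sqrt(raw_score * (n_items - raw_score) / (n_items - 1))


# SEM across several raw scores on a 40-item test.
for score in (20, 30, 38, 40):
    print(score, round(binomial_sem(score, 40), 2))
```

Under this model the SEM is symmetric around half the items correct and falls to zero at a perfect score.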
Peer reviewed
Axelrod, Bradley N.; And Others – Psychological Assessment, 1996
The reliabilities that D. Schretlen, R. H. B. Benedict, and J. H. Bobholz (1994) calculated for a short form of the Wechsler Adult Intelligence Scale-Revised (WAIS-R) were consistently overestimated. More accurate values are provided for the WAIS-R and a seven-subtest short form. (SLD)
Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests
Peer reviewed
Stark, Stephen; Drasgow, Fritz – Applied Psychological Measurement, 2002
Describes item response and information functions for the Zinnes and Griggs paired comparison item response theory (IRT) model (1974) and presents procedures for estimating stimulus and person parameters. Monte Carlo simulations show that at least 400 ratings are required to obtain reasonably accurate estimates of the stimulus parameters and their…
Descriptors: Comparative Analysis, Computer Simulation, Error of Measurement, Item Response Theory
Peer reviewed
Woodruff, David – Journal of Educational Measurement, 1991
Improvements are made on previous estimates for the conditional standard error of measurement in prediction, the conditional standard error of estimation (CSEE), and the conditional standard error of prediction (CSEP). Better estimates of how test length affects CSEE and CSEP are derived. (SLD)
Descriptors: Equations (Mathematics), Error of Measurement, Estimation (Mathematics), Mathematical Models
Yi, Qing; Wang, Tianyou; Ban, Jae-Chun – 2000
Error indices (bias, standard error of estimation, and root mean square error) obtained on different scales of measurement under different test termination rules in a computerized adaptive test (CAT) context were examined. Four ability estimation methods were studied: (1) maximum likelihood estimation (MLE); (2) weighted likelihood estimation…
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Error of Measurement
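The three error indices named in the entry above (bias, standard error of estimation, and root mean square error) are typically computed over replicated ability estimates for a known true value. The sketch below uses their usual simulation-study definitions, which are assumed rather than taken from the paper; the function name `error_indices` and the sample estimates are made up for illustration.

```python
import math
from statistics import fmean, pstdev


def error_indices(estimates: list[float], true_value: float) -> dict[str, float]:
    """Bias, standard error, and RMSE of replicated estimates of one true value."""
    errors = [est - true_value for est in estimates]
    bias = fmean(errors)                                  # mean signed error
    se = pstdev(estimates)                                # spread around the mean estimate
    rmse = math.sqrt(fmean([err * err for err in errors]))
    return {"bias": bias, "se": se, "rmse": rmse}


# Hypothetical ability estimates for an examinee whose true ability is 1.0.
result = error_indices([0.9, 1.1, 1.3], true_value=1.0)
print({name: round(value, 4) for name, value in result.items()})
```

Because the population standard deviation is used, the indices satisfy RMSE² = bias² + SE², so any two of them determine the third.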