Descriptor
| Estimation (Mathematics) | 39 |
| Test Length | 39 |
| Item Response Theory | 13 |
| Test Items | 13 |
| Sample Size | 11 |
| Ability | 10 |
| Adaptive Testing | 10 |
| Error of Measurement | 10 |
| Mathematical Models | 10 |
| Computer Simulation | 9 |
| Bayesian Statistics | 8 |
Source
| Applied Measurement in… | 3 |
| Applied Psychological… | 3 |
| Educational and Psychological… | 3 |
| Journal of Educational… | 2 |
| Psychological Assessment | 1 |
| Psychometrika | 1 |
Publication Type
| Reports - Evaluative | 25 |
| Reports - Research | 14 |
| Journal Articles | 13 |
| Speeches/Meeting Papers | 13 |
| Numerical/Quantitative Data | 1 |
Audience
| Researchers | 3 |
Assessments and Surveys
| Test of English as a Foreign… | 3 |
| COMPASS (Computer Assisted… | 1 |
| Iowa Tests of Basic Skills | 1 |
| Medical College Admission Test | 1 |
| SAT (College Admission Test) | 1 |
| Wechsler Adult Intelligence… | 1 |
Peer reviewed: Feldt, Leonard S. – Applied Measurement in Education, 2002
Considers the degree of bias in testlet-based alpha (internal consistency reliability) through hypothetical examples and real test data from four tests of the Iowa Tests of Basic Skills. Presents a simple formula for computing a testlet-based congeneric coefficient. (SLD)
Descriptors: Estimation (Mathematics), Reliability, Statistical Bias, Test Format
Peer reviewed: De Ayala, R. J. – Applied Psychological Measurement, 1994
Previous work on the effects of dimensionality on parameter estimation for dichotomous models is extended to the graded response model. Datasets are generated that differ in the number of latent factors as well as their interdimensional association, number of test items, and sample size. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics, Sample Size
Peer reviewed: Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995
A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)
Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions
Peer reviewed: Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
Livingston, Samuel A.; Lewis, Charles – 1993
This paper presents a method for estimating the accuracy and consistency of classifications based on test scores. The scores can be produced by any scoring method, including the formation of a weighted composite. The estimates use data from a single form. The reliability of the score is used to estimate its effective test length in terms of…
Descriptors: Classification, Error of Measurement, Estimation (Mathematics), Reliability
Wingersky, Marilyn S. – 1989
In a variable-length adaptive test with a stopping rule that relied on the asymptotic standard error of measurement of the examinee's estimated true score, M. S. Stocking (1987) discovered that it was sufficient to know the examinee's true score and the number of items administered to predict with some accuracy whether an examinee's true score was…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
Livingston, Samuel A. – 1981
The standard error of measurement (SEM) is a measure of the inconsistency in the scores of a particular group of test-takers. It is largest for test-takers with scores ranging in the 50 percent correct bracket; with nearly perfect scores, it is smaller. On tests used to make pass/fail decisions, the test-takers' scores tend to cluster in the range…
Descriptors: Error of Measurement, Estimation (Mathematics), Mathematical Formulas, Pass Fail Grading
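The pattern described in the abstract above (SEM largest near 50 percent correct, smaller near a perfect score) can be illustrated with a minimal sketch. It assumes Lord's binomial error model, in which the conditional SEM for a raw score x on an n-item test is sqrt(x(n - x)/(n - 1)); this is a standard textbook formula and not necessarily the method used in the report. The function name `binomial_sem` and the 40-item test length are hypothetical choices made for illustration.

```python
import math


def binomial_sem(raw_score: int, n_items: int) -> float:
    """Conditional SEM under the binomial error model (an assumed model).

    raw_score: number of items answered correctly.
    n_items: total number of dichotomously scored items (at least 2).
    """
    if n_items < 2:
        raise ValueError("n_items must be at least 2")
    if not 0 <= raw_score <= n_items:
        raise ValueError("raw_score must lie between 0 and n_items")
    return math.sqrt(raw_score * (n_items - raw_score) / (n_items - 1))


if __name__ == "__main__":
    n_items = 40  # hypothetical test length
    for score in (20, 30, 38, 40):
        pct = 100 * score / n_items
        print(f"{score}/{n_items} ({pct:.0f}% correct): SEM = {binomial_sem(score, n_items):.2f}")
```

Under this model the SEM peaks at the midpoint of the score scale and falls to zero at a perfect score, which matches the qualitative statement in the abstract.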
Patsula, Liane N.; Gessaroli, Marc E. – 1995
Among the most popular techniques used to estimate item response theory (IRT) parameters are those used in the LOGIST and BILOG computer programs. Because of its accuracy with smaller sample sizes or differing test lengths, BILOG has become the standard to which new estimation programs are compared. However, BILOG is still complex and…
Descriptors: Comparative Analysis, Effect Size, Estimation (Mathematics), Item Response Theory
Peer reviewed: Axelrod, Bradley N.; And Others – Psychological Assessment, 1996
The calculations of D. Schretlen, R. H. B. Benedict, and J. H. Bobholz (1994) for the reliabilities of a short form of the Wechsler Adult Intelligence Scale-Revised (WAIS-R) consistently overestimated the values. More accurate values are provided for the WAIS-R and a seven-subtest short form. (SLD)
Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests
Yamamoto, Kentaro – 1995
The traditional indicator of test speededness, missing responses, clearly indicates a lack of time to respond (thereby indicating the speededness of the test), but it is inadequate for evaluating speededness in a multiple-choice test scored as number correct, and it underestimates test speededness. Conventional item response theory (IRT) parameter…
Descriptors: Ability, Estimation (Mathematics), Item Response Theory, Multiple Choice Tests
Peer reviewed: Kim, Seock-Ho; And Others – Psychometrika, 1994
Hierarchical Bayes procedures for the two-parameter logistic item response model were compared for estimating item and ability parameters through two joint and two marginal Bayesian procedures. Marginal procedures yielded smaller root mean square differences for item and ability, but results for larger sample size and test length were similar.…
Descriptors: Ability, Bayesian Statistics, Computer Simulation, Estimation (Mathematics)
Peer reviewed: Wang, Tianyou; Hanson, Bradley A.; Lau, Che-Ming A. – Applied Psychological Measurement, 1999
Extended the use of a beta prior in trait estimation to the maximum expected a posteriori (MAP) method of Bayesian estimation. This new method, essentially unbiased MAP, was compared with MAP, essentially unbiased expected a posteriori, weighted likelihood, and maximum-likelihood estimation methods. The new method significantly reduced bias in…
Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Estimation (Mathematics)
Abdel-fattah, Abdel-fattah A. – 1994
The accuracy of estimation procedures in item response theory was studied using Monte Carlo methods and varying sample size, number of subjects, and distribution of ability parameters for: (1) joint maximum likelihood as implemented in the computer program LOGIST; (2) marginal maximum likelihood; and (3) marginal Bayesian procedures as implemented…
Descriptors: Ability, Bayesian Statistics, Estimation (Mathematics), Maximum Likelihood Statistics
Peer reviewed: Woodruff, David – Journal of Educational Measurement, 1991
Improvements are made on previous estimates for the conditional standard error of measurement in prediction, the conditional standard error of estimation (CSEE), and the conditional standard error of prediction (CSEP). Better estimates of how test length affects CSEE and CSEP are derived. (SLD)
Descriptors: Equations (Mathematics), Error of Measurement, Estimation (Mathematics), Mathematical Models
Seong, Tae-Je; And Others – 1997
This study was designed to compare the accuracy of three commonly used ability estimation procedures under the graded response model. The three methods, maximum likelihood (ML), expected a posteriori (EAP), and maximum a posteriori (MAP), were compared using a recovery study design for two sample sizes, two underlying ability distributions, and…
Descriptors: Ability, Comparative Analysis, Difficulty Level, Estimation (Mathematics)


