Descriptor
Author
Publication Type
| Reports - Research | 14 |
| Journal Articles | 4 |
| Speeches/Meeting Papers | 4 |
| Numerical/Quantitative Data | 1 |
Education Level
Audience
| Researchers | 3 |
Location
Laws, Policies, & Programs
Assessments and Surveys
| Test of English as a Foreign… | 2 |
| Iowa Tests of Basic Skills | 1 |
| Medical College Admission Test | 1 |
| SAT (College Admission Test) | 1 |
What Works Clearinghouse Rating
Peer reviewedFeldt, Leonard S. – Applied Measurement in Education, 2002
Considers the degree of bias in testlet-based alpha (internal consistency reliability) through hypothetical examples and real test data from four tests of the Iowa Tests of Basic Skills. Presents a simple formula for computing a testlet-based congeneric coefficient. (SLD)
Descriptors: Estimation (Mathematics), Reliability, Statistical Bias, Test Format
Peer reviewedHambleton, Ronald K.; Jones, Russell W. – Applied Measurement in Education, 1994
The impact of capitalizing on chance in item selection on the accuracy of test information functions was studied through simulation, focusing on examinee sample size in item calibration and the ratio of item bank size to test length. (SLD)
Descriptors: Computer Simulation, Estimation (Mathematics), Item Banks, Item Response Theory
Veldkamp, Bernard P. – 1998
In this paper, a mathematical programming approach is presented for the assembly of ability tests measuring multiple traits. The values of the variance functions of the estimators of the traits are minimized, while test specifications are met. The approach is based on Lagrangian relaxation techniques and provides good results for the two…
Descriptors: Ability, Estimation (Mathematics), Foreign Countries, Item Banks
Livingston, Samuel A. – 1984
Much previously published material for estimating the reliability of classification has been based on the assumption that a test consists of a known number of equally weighted items. The test score is the number of those items answered correctly. These methods cannot be used with classifications based on weighted composite scores, especially if…
Descriptors: Equated Scores, Essay Tests, Estimation (Mathematics), Mathematical Models
Peer reviewedVan Der Linden, Wim J. – Educational and Psychological Measurement, 1983
This paper focuses on mixtures of two binomials with one known success parameter. It is shown how moment estimators can be obtained for the remaining unknown parameters of such mixtures, and results are presented from a Monte Carlo study carried out to explore the statistical properties of these estimators. (PN)
Descriptors: Educational Testing, Error of Measurement, Estimation (Mathematics), Guessing (Tests)
Wainer, Howard – 1985
It is important to estimate the number of examinees who reached a test item, because item difficulty is defined by the number who answered correctly divided by the number who reached the item. A new method is presented and compared to the previously used definition of three categories of response to an item: (1) answered; (2) omitted--a…
Descriptors: College Entrance Examinations, Difficulty Level, Estimation (Mathematics), High Schools
Mitchell, Karen J.; Anderson, Judith A. – 1987
The Association of American Medical Colleges is conducting research to develop, implement, and evaluate a Medical College Admission Test (MCAT) essay testing program. Essay administration in the spring and fall of 1985 and 1986 suggested that additional research was needed on the development of topics which elicit similar skills and meet standard…
Descriptors: College Entrance Examinations, Essay Tests, Estimation (Mathematics), Generalizability Theory
Wingersky, Marilyn S.; Lord, Frederic M. – 1983
The sampling errors of maximum likelihood estimates of item-response theory parameters are studied in the case where both people and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Banks, Latent Trait Theory
Schaefer, Mary M.; Gross, Susan K. – 1983
Viewing the reliability for criterion-referenced tests as that of mastery classification decisions, three models for determining reliability were examined using two test administrations so that two estimates could be compared to a standard. A major purpose of the research was to determine how several reliability coefficients (coefficient kappa, an…
Descriptors: Comparative Analysis, Correlation, Criterion Referenced Tests, Cutting Scores
Weiss, David J.; McBride, James R. – 1983
Monte Carlo simulation was used to investigate score bias and information characteristics of Owen's Bayesian adaptive testing strategy, and to examine possible causes of score bias. Factors investigated in three related studies included effects of item discrimination, effects of fixed vs. variable test length, and effects of an accurate prior…
Descriptors: Ability Identification, Adaptive Testing, Bayesian Statistics, Computer Assisted Testing
Test-Retest Analyses of the Test of English as a Foreign Language. TOEFL Research Reports Report 45.
Henning, Grant – 1993
This study provides information about the total and component scores of the Test of English as a Foreign Language (TOEFL). First, the study provides comparative global and component estimates of test-retest, alternate-form, and internal-consistency reliability, controlling for sources of measurement error inherent in the examinees and the testing…
Descriptors: Difficulty Level, English (Second Language), Error of Measurement, Estimation (Mathematics)
Peer reviewedPlake, Barbara S.; Melican, Gerald J. – Educational and Psychological Measurement, 1989
The impact of overall test length and difficulty on the expert judgments of item performance by the Nedelsky method were studied. Five university-level instructors predicting the performance of minimally competent candidates on a mathematics examination were fairly consistent in their assessments regardless of length or difficulty of the test.…
Descriptors: Difficulty Level, Estimation (Mathematics), Evaluators, Higher Education
Bejar, Isaac I. – 1985
The Test of English as a Foreign Language (TOEFL) was used in this study, which attempted to develop a new methodology for assessing the speededness of right-scored tests. Traditional procedures of assessing speededness have assumed that the test is scored under formula-scoring instructions; this approach is not always appropriate. In this study,…
Descriptors: College Entrance Examinations, English (Second Language), Estimation (Mathematics), Evaluation Methods
Reshetar, Rosemary A.; And Others – 1992
This study examined performance of a simulated computerized adaptive test that was designed to help direct the development of a medical recertification examination. The item pool consisted of 229 single-best-answer items from a random sample of 3,000 examinees, calibrated using the two-parameter logistic model. Examinees' responses were known. For…
Descriptors: Adaptive Testing, Classification, Computer Assisted Testing, Computer Simulation


