Peer reviewed: Reynolds, Cecil R.; Clark, Julia H. – Psychology in the Schools, 1986
Describes a method using age equivalents and standard scores to recreate the full range of variability in the scores of high-functioning individuals. The method allows for a more complete interpretation of performance that can lead to better educational and therapeutic programming. (Author/ABB)
Descriptors: Children, Elementary Secondary Education, Gifted, High Achievement
Purves, Alan C.; And Others – 1990
After establishing a theoretical depiction of the domain of literature learning, a study developed test packages which examined: (1) the relationship among multiple choice, short open-ended, and long open-ended responses; (2) whether there would be differences according to the genres; (3) the relationship between literary and non-literary texts,…
Descriptors: Educational Research, Evaluation Methods, High Schools, Literary Genres
Peer reviewed: Diamond, James J. – Journal of Educational Measurement, 1975
Investigates the reliability and validity of scores yielded from a new scoring formula. (Author/DEP)
Descriptors: Guessing (Tests), Multiple Choice Tests, Objective Tests, Scoring
Divgi, D. R. – 1980
A method is proposed for providing an absolute, in contrast to comparative, evaluation of how well two tests are equated by transforming their raw scores into a particular common scale. The method is direct, not requiring creation of a standard for comparison; expresses its results in scaled rather than raw scores, and allows examination of the…
Descriptors: Equated Scores, Evaluation Criteria, Item Analysis, Latent Trait Theory
Peer reviewed: Lord, Frederic M. – Journal of Educational Measurement, 1984
Four methods are outlined for estimating or approximating from a single test administration the standard error of measurement of number-right test score at specified ability levels or cutting scores. The methods are illustrated and compared on one set of real test data. (Author)
Descriptors: Academic Ability, Cutting Scores, Error of Measurement, Scoring Formulas
Willingness to Answer Multiple-Choice Questions as Manifested Both in Genuine and in Nonsense Items.
Peer reviewed: Frary, Robert B.; Hutchinson, T. P. – Educational and Psychological Measurement, 1982
Alternate versions of Hutchinson's theory were compared, and one which implies the existence of partial knowledge was found to be better than one which implies that an appropriate measure of ability is obtained by applying the conventional correction for guessing. (Author/PN)
Descriptors: Guessing (Tests), Latent Trait Theory, Multiple Choice Tests, Scoring Formulas
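Note: the "conventional correction for guessing" referred to in the abstract above is commonly written as R - W/(k - 1), where R is the number of right answers, W the number of wrong answers (omitted items are not counted), and k the number of options per item. The following is a minimal sketch of that standard formula only; the function and parameter names are illustrative and are not taken from the cited article.

```python
def formula_score(num_right, num_wrong, num_options):
    """Conventional correction-for-guessing score: R - W / (k - 1).

    Omitted items are neither rewarded nor penalized, so they are
    not passed in. All names here are illustrative assumptions.
    """
    if num_options < 2:
        raise ValueError("an item needs at least two options")
    return num_right - num_wrong / (num_options - 1)


# Example: 30 right, 8 wrong, 2 omitted on a 40-item, 5-option test.
print(formula_score(30, 8, 5))
```

Under this formula an examinee who guesses blindly on every item has an expected score of zero, which is the property the correction is meant to provide.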
Veitch, William R. – 1979
The one parameter latent trait theory of Georg Rasch has two assumptions: that student abilities can be measured on an equal interval scale, and that the success of a student with a given item is a function of student achievement and item difficulty. The grade four Michigan Educational Assessment Program reading test was designed to measure…
Descriptors: Cutting Scores, Educational Assessment, Intermediate Grades, Item Analysis
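Note: the second assumption named in the abstract above, that success on an item is a function of student achievement and item difficulty, is usually expressed in the one-parameter (Rasch) model as a logistic function of the difference between the two. The sketch below shows that standard form; the function name and the logit scale for both arguments are assumptions for illustration and are not drawn from the cited report.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model.

    P(correct) = 1 / (1 + exp(-(ability - difficulty)))
    Both arguments are assumed to lie on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# A student whose ability equals the item difficulty succeeds half the time.
print(rasch_probability(0.0, 0.0))
```

Because only the difference between ability and difficulty enters the formula, shifting both by the same constant leaves every probability unchanged.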
Hambleton, Ronald K.; Novick, Melvin R. – 1972
In this paper, an attempt has been made to synthesize some of the current thinking in the area of criterion-referenced testing as well as to provide the beginning of an integration of theory and method for such testing. Since criterion-referenced testing is viewed from a decision-theoretic point of view, approaches to reliability and validity…
Descriptors: Criterion Referenced Tests, Measurement Instruments, Measurement Techniques, Scaling
Koehler, Roger A. – 1974
A potentially valuable measure of overconfidence on probabilistic multiple-choice tests was evaluated. The measure of overconfidence was based on probabilistic responses to nonsense items embedded in a vocabulary test. The test was administered under both confidence response and conventional choice response directions to 208 undergraduate…
Descriptors: Confidence Testing, Guessing (Tests), Measurement Techniques, Multiple Choice Tests
Jones, Bernard G.; Gramenz, Gary W. – Spectrum, 1983
Describes procedure for combining Stanford Achievement Test items with local supplementary items to measure individual and aggregate student performance in mathematics. Two reports are generated from the test results: a diagnostic report of student mastery of each objective and an overall score. (TE)
Descriptors: Criterion Referenced Tests, Elementary Secondary Education, Item Analysis, Item Banks
Peer reviewed: Upshur, John A.; Turner, Carolyn E. – ELT Journal, 1995
Reviews the place of rating scales in second-language measurement and summarizes some of the problems associated with them. Standard and alternative scales were studied. High agreement among raters can be achieved even under conditions not favorable to high interrater reliability. The full range of score categories are effectively utilized. (17…
Descriptors: Evaluation Problems, Interrater Reliability, Language Tests, Measurement Techniques
Cole, Nancy S. – 1982
The advantages and disadvantages of grade equivalent (GE) scores are explored, including appropriate uses for GE type scores and how to bring current GE scales closer to the type of information educators appear to desire. Although GE scores are not an equal interval scale, not comparable across school subjects, and do not indicate the grade level…
Descriptors: Academic Achievement, Elementary Secondary Education, Evaluation Methods, Formative Evaluation
Lowry, Stephen R. – 1977
The effects of luck and misinformation on the ability of multiple-choice test scores to estimate examinee ability were investigated. Two measures of examinee ability were defined. Misinformation was shown to have little effect on the ability of raw scores, and a substantial effect on the ability of corrected-for-guessing scores, to estimate examinee ability.…
Descriptors: Ability, College Students, Guessing (Tests), Multiple Choice Tests
Boldt, Robert F. – 1971
One formulation of confidence scoring requires the examinee to indicate as a number his personal probability of the correctness of each alternative in a multiple-choice test. For this formulation, a linear transformation of the logarithm of the correct response is maximized if the examinee reports accurately his personal probability. To equate…
Descriptors: Confidence Testing, Guessing (Tests), Multiple Choice Tests, Probability
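Note: the property described in the abstract above, that a linear transformation of the logarithm of the probability assigned to the correct response is maximized in expectation by accurate reporting, can be checked numerically. The sketch below is illustrative only: the constants `a` and `b`, the function names, and the example probabilities are assumptions, not values from the cited report.

```python
import math


def log_score(reported_prob_of_correct, a=0.0, b=1.0):
    """Score a + b * log(p), where p is the probability the examinee
    reported for the alternative that turned out to be correct.
    a and b (b > 0) are hypothetical scaling constants."""
    return a + b * math.log(reported_prob_of_correct)


def expected_log_score(personal_probs, reported_probs, a=0.0, b=1.0):
    """Expected score when alternative i is correct with the examinee's
    personal probability personal_probs[i], given the reported values."""
    return sum(
        p * log_score(r, a, b)
        for p, r in zip(personal_probs, reported_probs)
        if p > 0
    )


personal = [0.7, 0.2, 0.1]   # what the examinee actually believes
hedged = [1 / 3, 1 / 3, 1 / 3]  # a report that hides those beliefs

# Honest reporting yields the higher expected score.
print(expected_log_score(personal, personal) > expected_log_score(personal, hedged))
```

Any report that differs from the personal probabilities lowers the expected score, which is why this rule is described as rewarding accurate reporting.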
Kahl, Peter W. – Neusprachliche Mitteilungen, 1971
Descriptors: Achievement Tests, English (Second Language), Language Tests, Scoring


