Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 11 |
Descriptor
| Scoring Formulas | 68 |
| Test Items | 68 |
| Multiple Choice Tests | 26 |
| Test Reliability | 25 |
| Guessing (Tests) | 22 |
| Item Analysis | 19 |
| Difficulty Level | 18 |
| Higher Education | 17 |
| Test Construction | 16 |
| Scoring | 14 |
| Testing Problems | 12 |
Author
| Angoff, William H. | 3 |
| Plake, Barbara S. | 3 |
| Frary, Robert B. | 2 |
| Huynh, Huynh | 2 |
| Schrader, William B. | 2 |
| Smith, Richard M. | 2 |
| Weiss, David J. | 2 |
| Aaronson, May | 1 |
| Aghbar, Ali A. | 1 |
| Aiken, Lewis R. | 1 |
| Alliegro, Marissa C. | 1 |
Education Level
| Higher Education | 4 |
| Postsecondary Education | 4 |
| Secondary Education | 1 |
Audience
| Researchers | 7 |
| Practitioners | 2 |
| Teachers | 2 |
| Policymakers | 1 |
Laws, Policies, & Programs
| Education for All Handicapped… | 1 |
MacCann, Robert G. – Psychometrika, 2004
For (0, 1) scored multiple-choice tests, a formula giving test reliability as a function of the number of item options is derived, assuming the "knowledge or random guessing model," the parallelism of the new and old tests (apart from the guessing probability), and the assumptions of classical test theory. It is shown that the formula is a more…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Reliability, Test Theory
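As a reading aid (not reproduced from the article): under the knowledge-or-random-guessing model assumed above, an examinee either knows the item (probability $\omega$) or guesses at random among its $k$ options, so

$$P(\text{correct}) = \omega + \frac{1-\omega}{k}.$$

MacCann's reliability formula follows from applying this assumption to parallel tests whose items differ only in $k$; the exact expression is given in the article.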
Anderson, Richard Ivan – 1980
Features of a probabilistic testing system that has been implemented on the CERL PLATO computer system are described. The key feature of the system is the manner in which an examinee responds to each test item; the examinee distributes probabilities among the alternatives of each item by positioning a small square on or within an…
Descriptors: Computer Assisted Testing, Data Collection, Feedback, Probability
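The excerpt cuts off before describing how the distributed probabilities are scored. A minimal sketch, assuming a quadratic (Brier-type) scoring rule, one common choice in probabilistic testing; the function and numbers below are illustrative, not the PLATO system's actual rule:

```python
# Illustrative sketch only: scores a probability response to one item with a
# quadratic (Brier-type) rule. The rule actually used by the PLATO system is
# not given in the excerpt above; this is an assumed example.

def quadratic_item_score(probabilities, correct_index):
    """Score one item from the probabilities an examinee assigned to its options."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Reward probability placed on the keyed option, penalize probability elsewhere.
    return 2 * probabilities[correct_index] - sum(p * p for p in probabilities)

# Example: an examinee spreads belief over four options; option 2 is keyed correct.
print(quadratic_item_score([0.1, 0.2, 0.6, 0.1], correct_index=2))
```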
Willson, Victor L. – Educational and Psychological Measurement, 1982 (Peer reviewed)
The Serlin-Kaiser procedure is used to complete a principal components solution for scoring weights for all options of a given item. Coefficient alpha is maximized for a given multiple choice test. (Author/GK)
Descriptors: Analysis of Covariance, Factor Analysis, Multiple Choice Tests, Scoring Formulas
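The Serlin-Kaiser computations themselves are not shown in the excerpt. A rough sketch of the general idea, assuming the alpha-maximizing option weights are taken from the first principal component of the option-indicator covariance matrix (an assumption; the paper's exact procedure may differ):

```python
# Hedged sketch: option scoring weights from the first principal component of
# the option-indicator covariance matrix, the usual route to weights that raise
# coefficient alpha. Not the exact Serlin-Kaiser algorithm.
import numpy as np

def option_weights(indicators):
    """indicators: examinees x option-indicator matrix (1 if that option was chosen)."""
    cov = np.cov(indicators, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, -1]                    # loadings on the largest component

rng = np.random.default_rng(0)
indicators = rng.integers(0, 2, size=(200, 8)).astype(float)  # fake data for illustration
scores = indicators @ option_weights(indicators)              # weighted test scores
```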
Angoff, William H. – 1985
This paper points out that there are certain generalizations about directions for guessing and methods of scoring that require that data be derived from a random groups design. It supports the viewpoint that it is neither sufficient nor appropriate to make such generalizations on the basis of an analysis of scores obtained from the answer sheets of…
Descriptors: Correlation, Guessing (Tests), Research Design, Scoring Formulas
Brinzer, Raymond J. – 1979
The problem engendered by the Matching Familiar Figures (MFF) Test is one of instrument integrity (II). II is delimited by validity, reliability, and utility of MFF as a measure of the reflective-impulsive construct. Validity, reliability and utility of construct assessment may be improved by utilizing: (1) a prototypic scoring model that will…
Descriptors: Conceptual Tempo, Difficulty Level, Item Analysis, Research Methodology
Duncan, George T.; Milton, E. O. – Psychometrika, 1978 (Peer reviewed)
A multiple-answer multiple-choice test is one which offers several alternate choices for each stem and any number of those choices may be considered to be correct. In this article, a class of scoring procedures called the binary class is discussed. (Author/JKS)
Descriptors: Answer Keys, Measurement Techniques, Multiple Choice Tests, Scoring Formulas
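The binary class itself is not defined in the excerpt. One simple member of such a family, shown purely as an assumed illustration, scores each option dichotomously by whether the examinee's mark agrees with the key and sums over options:

```python
# Illustrative member of a "binary" scoring family for a multiple-answer item:
# each option scores 1 when the examinee's mark (selected or not) agrees with
# the key. This is an assumed example, not the specific procedures analyzed by
# Duncan and Milton.

def binary_option_score(selected, keyed):
    """selected, keyed: sets of option labels marked by the examinee / the key."""
    options = {"A", "B", "C", "D", "E"}
    return sum(1 for opt in options if (opt in selected) == (opt in keyed))

print(binary_option_score(selected={"A", "C"}, keyed={"A", "D"}))  # 3 of 5 options agree
```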
Dorans, Neil J. – Journal of Educational Measurement, 1986 (Peer reviewed)
The analytical decomposition demonstrates how the effects of item characteristics, test properties, individual examinee responses, and rounding rules combine to produce the item deletion effect on the equating/scaling function and candidate scores. The empirical portion of the report illustrates the effects of item deletion on reported score…
Descriptors: Difficulty Level, Equated Scores, Item Analysis, Latent Trait Theory
Kane, Michael; Moloney, James – Applied Psychological Measurement, 1978 (Peer reviewed)
The answer-until-correct (AUC) procedure requires that examinees respond to a multiple-choice item until they answer it correctly. Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the zero-one scoring procedure. (Author/CTM)
Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Multiple Choice Tests
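A minimal sketch of the two scoring procedures being compared; the linearly decreasing AUC credit schedule below is an illustrative choice, not taken from the article:

```python
# Sketch of answer-until-correct (AUC) scoring beside conventional zero-one scoring.
# The credit schedule (full credit on the first try, less on later tries) is an
# assumed example.

def auc_score(attempts_used, num_options):
    """Credit falls by one for each extra attempt: num_options - attempts_used."""
    return max(num_options - attempts_used, 0)

def zero_one_score(attempts_used):
    """Conventional scoring: credit only if the first attempt is correct."""
    return 1 if attempts_used == 1 else 0

for attempts in range(1, 5):
    print(attempts, auc_score(attempts, num_options=4), zero_one_score(attempts))
```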
Huynh, Huynh – Journal of Educational Statistics, 1986 (Peer reviewed)
Under the assumptions of classical measurement theory and the condition of normality, a formula is derived for the reliability of composite scores. The formula represents an extension of the Spearman-Brown formula to the case of truncated data. (Author/JAZ)
Descriptors: Computer Simulation, Error of Measurement, Expectancy Tables, Scoring Formulas
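For reference, the classical Spearman-Brown formula that the article extends: a composite of $n$ parallel parts, each with reliability $\rho$, has reliability

$$\rho_{nn} = \frac{n\rho}{1 + (n-1)\rho}.$$

The truncated-data extension itself is derived in the article and is not reproduced here.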
Gross, Leon J. – Evaluation and the Health Professions, 1982 (Peer reviewed)
Despite the 50 percent probability of a correctly guessed response, a multiple true-false examination should provide sufficient score variability for adequate discrimination without formula scoring. This scoring system directs examinees to respond to each item, with their scores based simply on the number of correct responses. (Author/CM)
Descriptors: Achievement Tests, Guessing (Tests), Health Education, Higher Education
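For context, the conventional formula score that the abstract argues is unnecessary here: with $R$ right answers, $W$ wrong answers, and $k$ options per item,

$$S = R - \frac{W}{k-1},$$

which for true-false items ($k = 2$) reduces to $S = R - W$; number-right scoring, as recommended above, is simply $S = R$.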
Pomplun, Mark; And Others – 1992
This study evaluated the use of bivariate matching as a solution to the problem of studying differential item functioning (DIF) with formula scored tests. Using Scholastic Aptitude Test verbal data with large samples, both male/female and black/white group comparisons were investigated. Mantel-Haenszel (MH) delta (D-DIF) values and DIF category…
Descriptors: Blacks, Criteria, Females, Item Bias
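For context, the MH D-DIF statistic referred to above is commonly defined from the Mantel-Haenszel common odds ratio across matched score levels $j$:

$$\hat{\alpha}_{MH} = \frac{\sum_j A_j D_j / N_j}{\sum_j B_j C_j / N_j}, \qquad \mathrm{MH\ D\text{-}DIF} = -2.35\,\ln \hat{\alpha}_{MH},$$

where $A_j$, $B_j$ are the reference group's correct and incorrect counts, $C_j$, $D_j$ the focal group's, and $N_j$ the total count at matching level $j$.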
Hutchinson, T. P. – 1984
One means of learning about the processes operating in a multiple choice test is to include some test items, called nonsense items, which have no correct answer. This paper compares two versions of a mathematical model of test performance to interpret test data that includes both genuine and nonsense items. One formula is based on the usual…
Descriptors: Foreign Countries, Guessing (Tests), Mathematical Models, Multiple Choice Tests
Aiken, Lewis R. – Educational and Psychological Measurement, 1980 (Peer reviewed)
Procedures for computing content validity and consistency reliability coefficients and determining the statistical significance of these coefficients are described. Procedures employing the multinomial probability distribution for small samples and normal curve probability estimates for large samples can be used where judgments are made on…
Descriptors: Computer Programs, Measurement Techniques, Probability, Questionnaires
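The content validity coefficient commonly associated with this paper (Aiken's $V$) can be written, with $n$ raters on a $c$-point scale and $s_i = r_i - l$ the excess of rater $i$'s rating over the lowest possible rating $l$, as

$$V = \frac{\sum_{i=1}^{n} s_i}{n(c-1)},$$

which ranges from 0 to 1; the significance procedures described above assess whether an observed $V$ exceeds what chance agreement would produce.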
Poizner, Sharon B.; And Others – Applied Psychological Measurement, 1978 (Peer reviewed)
Binary, probability, and ordinal scoring procedures for multiple-choice items were examined. In two situations, it was found that both the probability and ordinal scoring systems were more reliable than the binary scoring method. (Author/CTM)
Descriptors: Confidence Testing, Guessing (Tests), Higher Education, Multiple Choice Tests
Willingness to Answer Multiple-Choice Questions as Manifested Both in Genuine and in Nonsense Items.
Frary, Robert B.; Hutchinson, T. P. – Educational and Psychological Measurement, 1982 (Peer reviewed)
Alternate versions of Hutchinson's theory were compared, and one which implies the existence of partial knowledge was found to be better than one which implies that an appropriate measure of ability is obtained by applying the conventional correction for guessing. (Author/PN)
Descriptors: Guessing (Tests), Latent Trait Theory, Multiple Choice Tests, Scoring Formulas
