Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 5 |
| Since 2007 (last 20 years) | 23 |
Descriptor
| Scores | 60 |
| Scoring Formulas | 60 |
| Multiple Choice Tests | 16 |
| Statistical Analysis | 15 |
| Test Interpretation | 13 |
| Test Reliability | 12 |
| Guessing (Tests) | 11 |
| Test Validity | 9 |
| Correlation | 8 |
| Testing Problems | 8 |
| Academic Achievement | 7 |
Author
| Frary, Robert B. | 3 |
| Acar, Selcuk | 1 |
| Adams, Curt M. | 1 |
| Albanese, Mark A. | 1 |
| Amsbary, Michelle | 1 |
| Annis, Terri | 1 |
| Arkin, Robert M. | 1 |
| Baer, Justin | 1 |
| Bakker, J. | 1 |
| Baldi, Stephane, Ed. | 1 |
| Bardhoshi, Gerta | 1 |
Education Level
| Higher Education | 10 |
| Postsecondary Education | 8 |
| Secondary Education | 4 |
| Elementary Secondary Education | 2 |
| High Schools | 2 |
| Adult Education | 1 |
| Elementary Education | 1 |
| Grade 11 | 1 |
| Junior High Schools | 1 |
| Middle Schools | 1 |
Audience
| Researchers | 1 |
Location
| California | 2 |
| Czech Republic | 1 |
| Denmark | 1 |
| Israel | 1 |
| North Carolina | 1 |
| Oklahoma | 1 |
| Thailand | 1 |
| United Kingdom (Great Britain) | 1 |
| United States | 1 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 2 |
| Serrano v Priest | 1 |
Peer reviewed: Frary, Robert B. – Applied Psychological Measurement, 1980
Six scoring methods for assigning weights to right or wrong responses according to various instructions given to test takers are analyzed with respect to expected change scores and the effect of various levels of information and misinformation. Three of the methods provide feedback to the test taker. (Author/CTM)
Descriptors: Guessing (Tests), Knowledge Level, Multiple Choice Tests, Scores
Peer reviewed: Drasgow, Fritz; And Others – Applied Psychological Measurement, 1989
Multilinear formula scoring (MFS) is reviewed, with emphasis on estimating option characteristic curves (OCCs). MFS was used to estimate OCCs for the arithmetic reasoning subtest of the Armed Services Vocational Aptitude Battery for 2,978 examinees. A second analysis obtained OCCs for simulated data. The use of MFS is discussed. (SLD)
Descriptors: Estimation (Mathematics), Mathematical Models, Multiple Choice Tests, Scores
Livingston, Samuel A. – 1981
The standard error of measurement (SEM) is a measure of the inconsistency in the scores of a particular group of test-takers. It is largest for test-takers scoring near 50 percent correct and smaller for those with nearly perfect scores. On tests used to make pass/fail decisions, the test-takers' scores tend to cluster in the range…
Descriptors: Error of Measurement, Estimation (Mathematics), Mathematical Formulas, Pass Fail Grading
Peer reviewed: Hansen, Richard – Journal of Educational Measurement, 1971
The relationship between certain personality variables and the degree to which examinees display certainty in their responses was investigated. (Author)
Descriptors: Guessing (Tests), Individual Characteristics, Multiple Choice Tests, Personality Assessment
Frary, Robert B.; And Others – 1985
Students in an introductory college course (n=275) responded to equivalent 20-item halves of a test under number-right and formula-scoring instructions. Formula scores of those who omitted items averaged about one point lower than their comparable (formula adjusted) scores on the test half administered under number-right instructions. In contrast,…
Descriptors: Guessing (Tests), Higher Education, Multiple Choice Tests, Questionnaires
Knapp, Thomas R. – Measurement and Evaluation in Guidance, 1980
Supports arguments against general use of change scores and recommends the Lord/McNemar estimates of true change. Provides a numerical example illustrating the reliability problem and the problem of the prediction of true change from various linear composites of initial and final measures. (Author)
Descriptors: Counseling Techniques, Literature Reviews, Pretests Posttests, Research Methodology
Rand, Earl – 1978
A project is described that was undertaken to investigate: (1) how long a cloze test has to be to achieve optimum reliability without wasting anyone's time; and (2) how cloze tests should be scored in order to obtain maximum reliability. The literature recommended 50 deletions in order to provide for an adequate sample of examinees' abilities; it…
Descriptors: Cloze Procedure, English (Second Language), Higher Education, Language Research
Plake, Barbara S.; And Others – 1980
Number right and elimination scores were analyzed on a 48-item college level mathematics test that was assembled from pretest data in three forms by varying the item orderings: easy-hard, uniform, or random. Half of the forms contained information explaining the item arrangement and suggesting strategies for taking the test. Several anxiety…
Descriptors: Difficulty Level, Higher Education, Multiple Choice Tests, Quantitative Tests
Childs, Roy – 1976
The norm-referenced score scale used by the National Foundation for Educational Research (NFER) is described. The usefulness of standardized scores is explained by a simple numerical example, and the formulas and computations are shown for calculating a mean, a standard deviation, and a deviation or z score. The need for a representative sample is…
Descriptors: Computation, Foreign Countries, Guides, Mathematical Formulas
Peer reviewed: Arkin, Robert M.; Walts, Elizabeth A. – Journal of Educational Psychology, 1983
The effects of corrective testing and how such feedback might affect high- and low-test-anxious students differently are indicated. Subjects were 286 college students in three classes--one using mastery testing and two using multiple choice tests. (Author/PN)
Descriptors: Attribution Theory, Feedback, Higher Education, Mastery Tests
Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…
Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items
Cole, Nancy S. – 1982
The advantages and disadvantages of grade equivalent (GE) scores are explored, including appropriate uses for GE type scores and how to bring current GE scales closer to the type of information educators appear to desire. Although GE scores are not an equal interval scale, not comparable across school subjects, and do not indicate the grade level…
Descriptors: Academic Achievement, Elementary Secondary Education, Evaluation Methods, Formative Evaluation
Cross, Lawrence H.; Frary, Robert B. – 1976
It has been demonstrated that corrected-for-guessing scores will be superior to number-right scores in providing estimates of examinee standing on the trait measured by a multiple-choice test, if it can be assumed that examinees can and will comply with the appropriate directions. The purpose of the present study was to test the validity of that…
Descriptors: Achievement Tests, Guessing (Tests), Individual Characteristics, Multiple Choice Tests
Boldt, Robert F. – 1971
One formulation of confidence scoring requires the examinee to indicate as a number his personal probability of the correctness of each alternative in a multiple-choice test. For this formulation, a linear transformation of the logarithm of the correct response is maximized if the examinee reports accurately his personal probability. To equate…
Descriptors: Confidence Testing, Guessing (Tests), Multiple Choice Tests, Probability
Foegen, Anne – Diagnostique, 2000
A study involving 105 sixth-graders examined three aspects of technical adequacy with respect to two general outcome measures in mathematics: the effects of aggregating scores and correcting for random guessing on reliability and validity and the extent to which the measures were sensitive to changes in performance. (Contains references.)…
Descriptors: Curriculum Based Assessment, Disabilities, Grade 6, Mathematics

