Showing all 14 results
Peer reviewed
Kalinowski, Steven T.; Willoughby, Shannon – Journal of Research in Science Teaching, 2019
We present a multiple-choice test, the Montana State University Formal Reasoning Test (FORT), to assess college students' scientific reasoning ability. The test defines scientific reasoning to be equivalent to formal operational reasoning. It contains 20 questions divided evenly among five types of problems: control of variables, hypothesis…
Descriptors: Science Tests, Test Construction, Science Instruction, Introductory Courses
Peer reviewed
Fiedler, Daniela; Tröbst, Steffen; Harms, Ute – CBE - Life Sciences Education, 2017
Students of all ages face severe conceptual difficulties regarding key aspects of evolution--the central, unifying, and overarching theme in biology. Aspects strongly related to abstract "threshold" concepts like randomness and probability appear to pose particular difficulties. A further problem is the lack of an appropriate instrument…
Descriptors: College Students, Concept Formation, Probability, Evolution
Peer reviewed
Sadaghiani, Homeyra R.; Pollock, Steven J. – Physical Review Special Topics - Physics Education Research, 2015
As part of an ongoing investigation of students' learning in first semester upper-division quantum mechanics, we needed a high-quality conceptual assessment instrument for comparing outcomes of different curricular approaches. The process of developing such a tool started with converting a preliminary version of a 14-item open-ended quantum…
Descriptors: Science Instruction, Quantum Mechanics, Mechanics (Physics), Multiple Choice Tests
Peer reviewed
Stewart, Jeffrey; White, David A. – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2011
Multiple-choice tests such as the Vocabulary Levels Test (VLT) are often viewed as a preferable estimator of vocabulary knowledge when compared to yes/no checklists, because self-reporting tests introduce the possibility of students overreporting or underreporting scores. However, multiple-choice tests have their own unique disadvantages. It has…
Descriptors: Guessing (Tests), Scoring Formulas, Multiple Choice Tests, Test Reliability
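The trade-off the abstract raises is usually handled with the conventional correction-for-guessing (formula-scoring) adjustment for multiple-choice tests. The sketch below is a generic illustration of that standard formula, not the procedure from the cited article; the function name and the worked numbers are invented for the example.

```python
def corrected_score(right: int, wrong: int, choices: int) -> float:
    """Conventional formula-score correction: R - W / (k - 1).

    Under blind guessing on k-choice items, the expected number of
    lucky correct answers equals W / (k - 1), so subtracting that
    term removes the expected guessing gain. Omitted items are
    neither rewarded nor penalized.
    """
    return right - wrong / (choices - 1)

# Hypothetical 30-item, 4-choice test: 18 right, 12 wrong.
print(corrected_score(18, 12, 4))  # 14.0
```

The correction assumes purely random guessing; partial knowledge (eliminating some distractors first) violates that assumption, which is one of the "unique disadvantages" such scoring debates turn on.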
Anderson, Richard Ivan – Journal of Computer-Based Instruction, 1982
Describes confidence testing methods (confidence weighting, probabilistic marking, multiple alternative selection) as alternative to computer-based, multiple choice tests and explains potential benefits (increased reliability, improved examinee evaluation of alternatives, extended diagnostic information and remediation prescriptions, happier…
Descriptors: Computer Assisted Testing, Confidence Testing, Multiple Choice Tests, Probability
Peer reviewed
Zimmerman, Donald W. – Educational and Psychological Measurement, 1985
A computer program simulated guessing on multiple-choice test items and calculated deviation IQs from observed scores that contained a guessing component. Extensive variability in deviation IQs due entirely to chance was found. (Author/LMO)
Descriptors: Computer Simulation, Error of Measurement, Guessing (Tests), Intelligence Quotient
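The kind of simulation the abstract describes can be sketched in a few lines: model an observed score as known items plus a binomial guessing component, then see how much of the score spread is pure chance. This is an illustrative reconstruction under assumed parameters (test length, choices per item, raw-score-to-IQ scaling), not Zimmerman's actual program.

```python
import random
import statistics

def simulate_observed_scores(known: int, guessed: int, choices: int,
                             trials: int, seed: int = 0) -> list[int]:
    """Observed score = items truly known + a binomial guessing component."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        lucky = sum(rng.random() < 1 / choices for _ in range(guessed))
        scores.append(known + lucky)
    return scores

# Hypothetical 40-item, 5-choice test; the examinee knows 20 items
# and guesses blindly on the other 20.
scores = simulate_observed_scores(known=20, guessed=20, choices=5, trials=10_000)
spread = statistics.stdev(scores)
# If, say, 6 raw-score points correspond to 15 deviation-IQ points,
# chance alone moves deviation IQs by roughly spread * 15 / 6 points.
print(round(spread, 2), round(spread * 15 / 6, 1))
```

Even though the examinee's knowledge is fixed, repeated "retests" produce different observed scores, which is the chance-driven IQ variability the study reports.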
Rippey, Robert M. – Journal of Educational Measurement, 1970
Descriptors: Multiple Choice Tests, Prediction, Probability, Scoring
Peer reviewed
Hansen, Richard – Journal of Educational Measurement, 1971
The relationship between certain personality variables and the degree to which examinees display certainty in their responses was investigated. (Author)
Descriptors: Guessing (Tests), Individual Characteristics, Multiple Choice Tests, Personality Assessment
Frary, Robert B.; Zimmerman, Donald W. – Educational and Psychological Measurement, 1970
Descriptors: Error of Measurement, Guessing (Tests), Multiple Choice Tests, Probability
Wilcox, Rand R. – 1982
This document contains three papers from the Methodology Project of the Center for the Study of Evaluation. Methods for characterizing test accuracy are reported in the first two papers. "Bounds on the K Out of N Reliability of a Test, and an Exact Test for Hierarchically Related Items" describes and illustrates how an extension of a…
Descriptors: Educational Testing, Evaluation Methods, Guessing (Tests), Latent Trait Theory
Kane, Michael T.; Moloney, James M. – 1976
The Answer-Until-Correct (AUC) procedure has been proposed in order to increase the reliability of multiple-choice items. A model for examinees' behavior when they must respond to each item until they answer it correctly is presented. An expression for the reliability of AUC items, as a function of the characteristics of the item and the scoring…
Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Multiple Choice Tests
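One common way Answer-Until-Correct items are scored is with decreasing partial credit for each additional attempt. The rule below is a generic illustration of that idea, not necessarily the scoring function analyzed by Kane and Moloney; the linear credit schedule is an assumption for the example.

```python
def auc_item_score(attempts: int, choices: int) -> float:
    """One common answer-until-correct scoring rule: full credit on the
    first try, linearly decreasing partial credit per extra attempt,
    and zero credit when the key is found only by exhausting all
    alternatives."""
    if not 1 <= attempts <= choices:
        raise ValueError("attempts must be between 1 and the number of choices")
    return (choices - attempts) / (choices - 1)

# A 4-choice item: credit for finding the key on tries 1 through 4.
print([round(auc_item_score(a, 4), 3) for a in (1, 2, 3, 4)])  # [1.0, 0.667, 0.333, 0.0]
```

Because every examinee eventually reaches the key, the score carries information about how close they were, which is the mechanism behind the claimed reliability gain.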
Brennan, Robert L.; Lockwood, Robert E. – 1979
Procedures for determining cutting scores have been proposed by Angoff and by Nedelsky. Nedelsky's approach requires that a rater examine each distractor within a test item to determine the probability of a minimally competent examinee answering correctly; whereas Angoff uses a judgment based on the whole item, rather than each of its components.…
Descriptors: Achievement Tests, Comparative Analysis, Cutting Scores, Guessing (Tests)
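The contrast the abstract draws can be made concrete: under Nedelsky's method the minimally competent examinee eliminates some distractors and guesses among what remains, so the item probability is one over the options left; under Angoff's method the rater judges each whole item's probability directly. The sketch below illustrates both computations with invented ratings; it is not drawn from the Brennan and Lockwood study itself.

```python
def nedelsky_cut(eliminated_per_item: list[int], choices: int = 4) -> float:
    """Nedelsky cutting score: for each item, the minimally competent
    examinee eliminates some distractors and then guesses at random
    among the remaining options, so p = 1 / (options left)."""
    return sum(1 / (choices - e) for e in eliminated_per_item)

def angoff_cut(p_correct_per_item: list[float]) -> float:
    """Angoff cutting score: sum of whole-item probability judgments."""
    return sum(p_correct_per_item)

# Hypothetical three 4-choice items, with 2, 1, and 3 distractors
# judged eliminable by a minimally competent examinee:
print(round(nedelsky_cut([2, 1, 3]), 3))        # 1/2 + 1/3 + 1/1
print(round(angoff_cut([0.6, 0.4, 0.9]), 3))    # direct judgments
```

Nedelsky ratings are constrained to the discrete values 1/k, 1/(k-1), …, 1, whereas Angoff judgments can take any probability, which is one practical source of the differences such comparisons examine.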
Shuford, Emir H., Jr.; Brown, Thomas A. – 1974
A student's choice of an answer to a test question is a coarse measure of his knowledge about the subject matter of the question. Much finer measurement might be achieved if the student were asked to estimate, for each possible answer, the probability that it is the correct one. Such a procedure could yield two classes of benefits: (a) students…
Descriptors: Bias, Computer Programs, Confidence Testing, Decision Making
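Probability-estimation testing of the kind the abstract describes is typically scored with a "proper" scoring rule, under which honest reporting maximizes the examinee's expected score. The logarithmic rule below is one standard example of such a rule, shown here as a generic sketch rather than the specific scoring system of the cited report.

```python
import math

def log_score(reported: list[float], correct: int) -> float:
    """Logarithmic scoring rule: reward is the log of the probability
    the examinee assigned to the correct answer. The rule is 'proper':
    reporting one's honest beliefs maximizes expected score."""
    total = sum(reported)
    if abs(total - 1.0) > 1e-9 or any(p <= 0 for p in reported):
        raise ValueError("reported probabilities must be positive and sum to 1")
    return math.log(reported[correct])

# A 3-option item where the examinee's honest beliefs are (0.7, 0.2, 0.1).
beliefs = [0.7, 0.2, 0.1]
expected_honest = sum(p * log_score(beliefs, i) for i, p in enumerate(beliefs))
hedged = [0.5, 0.3, 0.2]
expected_hedged = sum(p * log_score(hedged, i) for i, p in enumerate(beliefs))
print(expected_honest > expected_hedged)  # True: honesty pays in expectation
```

The diagnostic benefit mentioned in the abstract follows from the same mechanism: the full reported distribution, not just a single choice, is available for analysis.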
Levine, Michael V.; Rubin, Donald B. – 1976
Appropriateness indexes (statistical formulas) for detecting suspiciously high or low scores on aptitude tests were presented, based on a simulation of the Scholastic Aptitude Test (SAT) with 3,000 simulated scores--2,800 normal and 200 suspicious. The traditional index--marginal probability--uses a model for the normal examinee's test-taking…
Descriptors: Academic Ability, Aptitude Tests, College Entrance Examinations, High Schools
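An appropriateness index of the marginal-probability type can be illustrated with a simple person-fit computation: score each 0/1 response pattern by its log-likelihood under an item response model, and flag patterns that are improbably arranged (for example, missing easy items while passing hard ones). The Rasch-based sketch below is a minimal stand-in with invented difficulties, not the SAT-calibrated model of the Levine and Rubin study.

```python
import math

def rasch_p(theta: float, b: float) -> float:
    """Rasch model probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pattern_log_likelihood(responses: list[int], theta: float,
                           difficulties: list[float]) -> float:
    """Log-likelihood of a 0/1 response pattern given ability theta.
    Unusually low values flag 'inappropriate' (suspicious) patterns."""
    ll = 0.0
    for x, b in zip(responses, difficulties):
        p = rasch_p(theta, b)
        ll += math.log(p) if x else math.log(1.0 - p)
    return ll

# Five items ordered easy to hard (hypothetical difficulties).
diffs = [-1.0, -0.5, 0.0, 0.5, 1.0]
normal = [1, 1, 1, 0, 0]   # misses only the hard items
odd = [0, 0, 1, 1, 1]      # misses only the easy items: same raw score
print(pattern_log_likelihood(normal, 0.0, diffs) >
      pattern_log_likelihood(odd, 0.0, diffs))  # True
```

Both patterns have the same raw score, but the second is far less likely for a normal examinee, which is the signal an appropriateness index exploits.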