ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	0
Since 2017 (last 10 years)	1
Since 2007 (last 20 years)	1

Descriptor

Item Analysis	19
Scoring Formulas	19
Test Items	19
Difficulty Level	9
Test Reliability	8
Guessing (Tests)	6
Higher Education	6
Multiple Choice Tests	6
Test Construction	6
Test Validity	5
Latent Trait Theory	4
Scoring	4
Testing Problems	4
Weighted Scores	4
Achievement Tests	3
Computer Programs	3
Confidence Testing	3
Equated Scores	3
Mathematical Models	3
Scaling	3
Scores	3
Testing	3
Ability	2
Adaptive Testing	2
College Entrance Examinations	2

Source

Applied Psychological…	3
Educational and Psychological…	1
Evaluation in Education:…	1
Journal of Educational…	1
Language Testing	1
Review of Educational Research	1

Publication Type

Reports - Research	14
Speeches/Meeting Papers	7
Journal Articles	4
Information Analyses	1

Audience

Researchers

Assessments and Surveys

Graduate Record Examinations	1
Matching Familiar Figures Test	1
SAT (College Admission Test)	1
Test of English as a Foreign…	1

Showing 1 to 15 of 19 results

Developing, Analyzing, and Using Distractors for Multiple-Choice Tests in Education: A Comprehensive Review

Peer reviewed

Gierl, Mark J.; Bulut, Okan; Guo, Qi; Zhang, Xinxin – Review of Educational Research, 2017

Multiple-choice testing is considered one of the most effective and enduring forms of educational assessment that remains in practice today. This study presents a comprehensive review of the literature on multiple-choice testing in education focused, specifically, on the development, analysis, and use of the incorrect options, which are also…

Descriptors: Multiple Choice Tests, Difficulty Level, Accuracy, Error Patterns

Biserial Weights: A New Approach to Test Item Option Weighting

Peer reviewed

Claudy, John G. – Applied Psychological Measurement, 1978

Option weighting is an alternative to increasing test length as a means of improving the reliability of a test. The effects on test reliability of option weighting procedures were compared in two empirical studies using four independent sets of items. Biserial weights were found to be superior. (Author/CTM)

Descriptors: Higher Education, Item Analysis, Scoring Formulas, Test Items

Developing Homogeneous TOEFL Scales by Multidimensional Scaling.

Peer reviewed

Oltman, Phillip K.; Stricker, Lawrence J. – Language Testing, 1990

A recent multidimensional scaling analysis of the Test of English-as-a-Foreign-Language (TOEFL) item response data identified clusters of items in the test sections that, being more homogeneous than their parent sections, might be better for diagnostic use. The analysis was repeated using different scoring techniques. Results diverged only for…

Descriptors: English (Second Language), Item Analysis, Language Tests, Scaling

New Directions in Matching Familiar Figures Test Research Resulting From Scoring and Item Analyses.

Brinzer, Raymond J. – 1979

The problem engendered by the Matching Familiar Figures (MFF) Test is one of instrument integrity (II). II is delimited by validity, reliability, and utility of MFF as a measure of the reflective-impulsive construct. Validity, reliability and utility of construct assessment may be improved by utilizing: (1) a prototypic scoring model that will…

Descriptors: Conceptual Tempo, Difficulty Level, Item Analysis, Research Methodology

The Impact of Item Deletion on Equating Conversions and Reported Score Distributions.

Peer reviewed

Dorans, Neil J. – Journal of Educational Measurement, 1986

The analytical decomposition demonstrates how the effects of item characteristics, test properties, individual examinee responses, and rounding rules combine to produce the item deletion effect on the equating/scaling function and candidate scores. The empirical portion of the report illustrates the effects of item deletion on reported score…

Descriptors: Difficulty Level, Equated Scores, Item Analysis, Latent Trait Theory

The Effect of Guessing on Item Reliability under Answer-Until-Correct Scoring

Peer reviewed

Kane, Michael; Moloney, James – Applied Psychological Measurement, 1978

The answer-until-correct (AUC) procedure requires that examinees respond to a multi-choice item until they answer it correctly. Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the zero-one scoring procedure. (Author/CTM)

Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Multiple Choice Tests

Some Exploratory Indices for Selection of a Test Equating Method.

Jaeger, Richard M. – 1980

Five statistical indices are developed and described which may be used for determining (1) when linear equating of two approximately parallel tests is adequate, and (2) when a more complex method such as equipercentile equating must be used. The indices were based on: (1) similarity of cumulative score distributions; (2) shape of the raw-score to…

Descriptors: College Entrance Examinations, Difficulty Level, Equated Scores, Higher Education

Robbins-Monro Procedures for Tailored Testing

Peer reviewed

Lord, Frederic M. – Educational and Psychological Measurement, 1971

Descriptors: Ability, Adaptive Testing, Computer Oriented Programs, Difficulty Level

The Effect of Misinformation on Item Discrimination Indices and Estimation Priorities of Multiple-Choice Test Scores.

Lowry, Stephen R. – 1979

A specially designed answer format was used for three tests in a college level agriculture class of 19 students to record responses to three things about each item: (1) the student's choice of the best answer; (2) the degree of certainty with which the answer was chosen; and (3) all the answer choices which the student was certain were incorrect.…

Descriptors: Achievement Tests, Confidence Testing, Guessing (Tests), Higher Education

Multiple Choice: A State of the Art Report

Wood, Robert – Evaluation in Education: International Progress, 1977

The author surveys literature and practice, primarily in Great Britain and the United States, about multiple-choice testing, comments on criticisms, and defends the state of the art. Various item types, item writing, test instructions and scoring formulas, item analysis, and test construction are discussed. An extensive bibliography is appended.…

Descriptors: Achievement Tests, Item Analysis, Multiple Choice Tests, Scoring Formulas

The Effect of Keying All Options Correct on Equating Functions and Scores.

Lenel, Julia C.; Gilmer, Jerry S. – 1986

In some testing programs an early item analysis is performed before final scoring in order to validate the intended keys. As a result, some items which are flawed and do not discriminate well may be keyed so as to give credit to examinees no matter which answer was chosen. This is referred to as allkeying. This research examined how varying the…

Descriptors: Equated Scores, Item Analysis, Latent Trait Theory, Licensing Examinations (Professions)

The Use of Precalibrated Item Bank to Establish and Maintain Cutoff Scores: A Case Study of the Florida Teacher Certification Examination.

Legg, Sue M. – 1982

A case study of the Florida Teacher Certification Examination (FTCE) program was described to assist others launching the development of large scale item banks. FTCE has four subtests: Mathematics, Reading, Writing, and Professional Education. Rasch calibrated item banks have been developed for all subtests except Writing. The methods used to…

Descriptors: Cutting Scores, Difficulty Level, Field Tests, Item Analysis

Measurement of Moral Judgment: Using Stimulus Pairs to Estimate Inter-stage Distances.

Sullivan, Arthur P. – 1978

Sullivan's Ethical Reasoning Scale contains three dilemmas with response pairs representing Kohlberg's stages of moral development. In Kohlberg's first three stages, goodness is equated with lack of punishment, usefulness, and approval, respectively. Good is seen as conformity to rule and ruler in stage four, and stage five comprises…

Descriptors: Adolescents, Adults, Attitude Measures, Conflict Resolution

Item-Option Weighting of Achievement Tests: Comparative Study of Methods.

Peer reviewed

Downey, Ronald G. – Applied Psychological Measurement, 1979

This research attempted to interrelate several methods of producing option weights (i.e., Guttman internal and external weights and judges' weights) and examined their effects on reliability and on concurrent, predictive, and face validity. It was concluded that option weighting offered limited, if any, improvement over unit weighting. (Author/CTM)

Descriptors: Achievement Tests, Answer Keys, Comparative Testing, High Schools

The Evaluation of Mastery Test Items. Final Report.

Brennan, Robert L. – 1974

The first four chapters of this report primarily provide an extensive, critical review of the literature with regard to selected aspects of the criterion-referenced and mastery testing fields. Major topics treated include: (a) definitions, distinctions, and background, (b) the relevance of classical test theory, (c) validity and procedures for…

Descriptors: Computer Programs, Confidence Testing, Criterion Referenced Tests, Error of Measurement

Author

Brennan, Robert L.	1
Brinzer, Raymond J.	1
Bulut, Okan	1
Claudy, John G.	1
Dorans, Neil J.	1
Downey, Ronald G.	1
Gierl, Mark J.	1
Gilmer, Jerry S.	1
Guo, Qi	1
Jaeger, Richard M.	1
Kane, Michael	1
Kingston, Neal M.	1
Legg, Sue M.	1
Lenel, Julia C.	1
Lord, Frederic M.	1
Lowry, Stephen R.	1
Mitchell, Virginia P.	1
Moloney, James	1
Oltman, Phillip K.	1
Rippey, Robert M.	1
Smith, Richard M.	1
Stricker, Lawrence J.	1
Sullivan, Arthur P.	1
Vale, C. David	1
Weiss, David J.	1