Showing 1 to 15 of 22 results
Peer reviewed
Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021
Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…
Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring
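A minimal sketch of the ipsativity problem this abstract refers to, under a simple classical scoring rule (rank-based block scoring; the scheme and all values are illustrative, not the authors' modeling approach). Every respondent's trait scores sum to the same constant, so the scores support only intraindividual comparisons:

    import numpy as np

    rng = np.random.default_rng(0)
    n_persons, n_traits, n_blocks = 5, 4, 12

    # One forced-choice block = a full ranking of one item per trait;
    # classical scoring credits each trait with its within-block rank (0..3).
    scores = np.array([
        sum(rng.permutation(n_traits) for _ in range(n_blocks))
        for _ in range(n_persons)
    ])
    print(scores)              # trait profiles differ between persons
    print(scores.sum(axis=1))  # but every row sums to the same constant: ipsative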
Peer reviewed
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2016
This article examines the possible dependency of composite reliability on presentation format of the elements of a multi-item measuring instrument. Using empirical data and a recent method for interval estimation of group differences in reliability, we demonstrate that the reliability of an instrument need not be the same when polarity of the…
Descriptors: Test Reliability, Test Format, Test Items, Differences
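Composite reliability here is the standard coefficient for a congeneric one-factor measure (McDonald's omega), computable from factor loadings and error variances; a minimal sketch with invented values:

    import numpy as np

    def composite_reliability(loadings, error_vars):
        # omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)
        s = np.sum(loadings) ** 2
        return s / (s + np.sum(error_vars))

    print(composite_reliability([0.7, 0.6, 0.8, 0.5], [0.51, 0.64, 0.36, 0.75]))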
Peer reviewed
Zhang, Xijuan; Savalei, Victoria – Educational and Psychological Measurement, 2016
Many psychological scales written in the Likert format include reverse worded (RW) items in order to control acquiescence bias. However, studies have shown that RW items often contaminate the factor structure of the scale by creating one or more method factors. The present study examines an alternative scale format, called the Expanded format,…
Descriptors: Factor Structure, Psychological Testing, Alternative Assessment, Test Items
Peer reviewed
Benson, Philip G.; Dickinson, Terry L. – Educational and Psychological Measurement, 1983
The mixed standard scale is a rating format that allows researchers to count internally inconsistent response patterns. This study investigated the meaning of these counts, using 943 accountants as raters. The counts of internally inconsistent response patterns were not related to reliability as measured by Cronbach's alpha. (Author/BW)
Descriptors: Accountants, Adults, Error Patterns, Rating Scales
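Cronbach's alpha, the reliability index used in this study, is computable directly from a persons-by-items score matrix; a minimal sketch (the helper name is mine):

    import numpy as np

    def cronbach_alpha(X):
        # X: (n_persons, n_items) matrix of item scores
        k = X.shape[1]
        item_vars = X.var(axis=0, ddof=1)      # variance of each item
        total_var = X.sum(axis=1).var(ddof=1)  # variance of total scores
        return k / (k - 1) * (1 - item_vars.sum() / total_var)

Alpha rises with the average inter-item covariance, which is one reason counts of inconsistent response patterns need not track it.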
Peer reviewed
Grosse, Martin E.; Wright, Benjamin D. – Educational and Psychological Measurement, 1985
A model of examinee behavior was used to generate hypotheses about the operation of true-false scores. Confirmation of hypotheses supported the contention that true-false scores contain an error component that makes these tests less reliable than multiple-choice tests. Examinee response style may invalidate a total true-false score. (Author/DWH)
Descriptors: Objective Tests, Response Style (Tests), Test Format, Test Reliability
Peer reviewed
Wilcox, Rand R. – Educational and Psychological Measurement, 1982
Results in the engineering literature on "k out of n system reliability" can be used to characterize tests based on estimates of the probability of correctly determining whether the examinee knows the correct response. In particular, the minimum number of distractors required for multiple-choice tests can be empirically determined.…
Descriptors: Achievement Tests, Mathematical Models, Multiple Choice Tests, Test Format
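The borrowed engineering result: a k-out-of-n system functions if at least k of its n independent components function, a binomial tail probability. A sketch under the simplifying assumption of equal component reliability p (Wilcox's treatment is more general):

    from math import comb

    def k_out_of_n_reliability(k, n, p):
        # P(at least k of n independent components, each with reliability p, work)
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

    print(k_out_of_n_reliability(3, 5, 0.9))  # e.g., a 3-out-of-5 system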
Peer reviewed
Campbell, Todd; And Others – Educational and Psychological Measurement, 1997
The construct validity of scores from the Bem Sex-Role Inventory was studied using confirmatory factor analysis methods on data from 791 subjects. Measurement characteristics of the long and short forms were studied, with the short form yielding more reliable scores, as has previously been indicated. (Author/SLD)
Descriptors: Adults, Construct Validity, Factor Structure, Scores
Peer reviewed
Tollefson, Nona – Educational and Psychological Measurement, 1987
This study compared the item difficulty, item discrimination, and test reliability of three forms of multiple-choice items: (1) one correct answer; (2) "none of the above" as a foil; and (3) "none of the above" as the correct answer. Twelve items in the three formats were administered in a college statistics examination. (BS)
Descriptors: Difficulty Level, Higher Education, Item Analysis, Multiple Choice Tests
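The item statistics compared in studies like this one are the classical indices; a minimal sketch (the helper name is mine) computing difficulty as the proportion correct and discrimination as the corrected item-total correlation:

    import numpy as np

    def item_analysis(X):
        # X: (n_persons, n_items) matrix of 0/1 item scores
        difficulty = X.mean(axis=0)  # proportion answering each item correctly
        rest = X.sum(axis=1, keepdims=True) - X  # total score excluding each item
        disc = [np.corrcoef(X[:, j], rest[:, j])[0, 1] for j in range(X.shape[1])]
        return difficulty, np.array(disc)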
Peer reviewed
Wainer, Howard; Lukhele, Robert – Educational and Psychological Measurement, 1997
The reliability of scores from four forms of the Test of English as a Foreign Language (TOEFL) was estimated using a hybrid item response theory model. Overall reliability differed very little whether the testlet items were assumed to be independent or their dependence was modeled. (Author/SLD)
Descriptors: English (Second Language), Item Response Theory, Scores, Second Language Learning
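The mechanism the hybrid model checks for can be illustrated with a toy simulation (all parameters invented): when items within a testlet share a passage-specific factor, coefficient alpha computed as if items were independent can overstate reliability; the finding here was that for these TOEFL forms the effect was small.

    import numpy as np

    rng = np.random.default_rng(1)
    n, testlets, per = 2000, 4, 5    # persons, testlets, items per testlet

    theta = rng.normal(size=(n, 1))  # trait of interest
    u = np.repeat(rng.normal(size=(n, testlets)), per, axis=1)  # testlet factors
    X = 0.7 * theta + 0.5 * u + rng.normal(size=(n, testlets * per))

    k = X.shape[1]
    alpha = k / (k - 1) * (1 - X.var(0, ddof=1).sum() / X.sum(1).var(ddof=1))
    print(alpha)  # ~0.90, though the total measures theta with reliability ~0.81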
Peer reviewed
Green, Kathy; And Others – Educational and Psychological Measurement, 1982
Achievement test reliability and validity as a function of ability were determined for multiple sections of a large undergraduate French class. Results did not support previous arguments that decreasing the number of options yields a more efficient test for high-level examinees but a less efficient one for low-level examinees. (Author/GK)
Descriptors: Academic Ability, Comparative Analysis, Higher Education, Multiple Choice Tests
Peer reviewed
Hancock, Gregory R.; And Others – Educational and Psychological Measurement, 1993
Two-option multiple-choice vocabulary test items are compared with comparably written true-false test items. Results from a study with 111 high school students suggest that multiple-choice items provide a significantly more reliable measure than the true-false format. (SLD)
Descriptors: Ability, High School Students, High Schools, Objective Tests
Peer reviewed
Straton, Ralph G.; Catts, Ralph M. – Educational and Psychological Measurement, 1980
Multiple-choice tests composed entirely of two-, three-, or four-choice items were investigated. Results indicated that the number of alternatives per item was inversely related to item difficulty but directly related to item discrimination. Reliability and standard error of measurement of three-choice item tests were equivalent or superior.…
Descriptors: Difficulty Level, Error of Measurement, Foreign Countries, Higher Education
Peer reviewed
Schriesheim, Chester A.; And Others – Educational and Psychological Measurement, 1989
Three studies explored the effects of grouped versus randomized questionnaire items on internal consistency and test-retest reliability, with samples of 80, 80, and 100 university students and undergraduates, respectively. The two correlational studies and one experimental study were reasonably consistent in demonstrating that neither format was…
Descriptors: Classification, College Students, Evaluation Methods, Higher Education
Peer reviewed
Paik, Chie; Michael, William B. – Educational and Psychological Measurement, 1999
Studied the internal consistency reliability and construct validity of scores on each of five dimensions of a Japanese version of the Dimensions of Self-Concept Scale. Results for 354 female high school students show that a five-factor oblique model accounts for the greatest proportion of covariance in the matrix of 15 subtests. Contains 20…
Descriptors: Construct Validity, Factor Structure, Females, Foreign Countries
Peer reviewed
Aiken, Lewis R. – Educational and Psychological Measurement, 1983
Each of six forms of a 10-item teacher evaluation rating scale, having two to seven response categories per form, was administered to over 100 college students. Means of item responses and item variances increased with the number of response categories. Internal consistency of total scores did not change systematically. (Author/PN)
Descriptors: College Students, Higher Education, Item Analysis, Rating Scales
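A toy illustration (my own construction, not the study's data) of the reported pattern: discretizing the same latent response into more categories raises both the item mean (scored 1 to k) and the item variance.

    import numpy as np

    rng = np.random.default_rng(2)
    latent = rng.normal(size=100_000)  # latent agreement for one item

    for k in range(2, 8):              # 2 to 7 response categories
        cuts = np.quantile(latent, np.linspace(0, 1, k + 1)[1:-1])
        item = np.digitize(latent, cuts) + 1  # scored 1..k
        print(k, round(item.mean(), 2), round(item.var(ddof=1), 2))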