Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021
Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…
Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring

Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2016
This article examines the possible dependency of composite reliability on presentation format of the elements of a multi-item measuring instrument. Using empirical data and a recent method for interval estimation of group differences in reliability, we demonstrate that the reliability of an instrument need not be the same when polarity of the…
Descriptors: Test Reliability, Test Format, Test Items, Differences

Zhang, Xijuan; Savalei, Victoria – Educational and Psychological Measurement, 2016
Many psychological scales written in the Likert format include reverse worded (RW) items in order to control acquiescence bias. However, studies have shown that RW items often contaminate the factor structure of the scale by creating one or more method factors. The present study examines an alternative scale format, called the Expanded format,…
Descriptors: Factor Structure, Psychological Testing, Alternative Assessment, Test Items

Benson, Philip G.; Dickinson, Terry L. – Educational and Psychological Measurement, 1983
The mixed standard scale is a rating format that allows researchers to count internally inconsistent response patterns. This study investigated the meaning of these counts, using 943 accountants as raters. The counts of internally inconsistent response patterns were not related to reliability as measured by Cronbach's alpha. (Author/BW)
Descriptors: Accountants, Adults, Error Patterns, Rating Scales
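The Benson and Dickinson entry above measures reliability with Cronbach's alpha. As a quick illustration of that coefficient (a generic sketch with hypothetical toy data, not the study's own analysis), alpha can be computed directly from an item-score matrix:

```python
import numpy as np

def cronbachs_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 4-item scale rated by 5 respondents:
scores = [[3, 4, 3, 4],
          [2, 2, 3, 2],
          [4, 5, 4, 5],
          [1, 2, 1, 2],
          [3, 3, 4, 3]]
```

Highly consistent item columns, as in this toy matrix, yield an alpha near 1; uncorrelated items drive it toward 0.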

Grosse, Martin E.; Wright, Benjamin D. – Educational and Psychological Measurement, 1985
A model of examinee behavior was used to generate hypotheses about the operation of true-false scores. Confirmation of hypotheses supported the contention that true-false scores contain an error component that makes these tests less reliable than multiple-choice tests. Examinee response style may invalidate a total true-false score. (Author/DWH)
Descriptors: Objective Tests, Response Style (Tests), Test Format, Test Reliability

Wilcox, Rand R. – Educational and Psychological Measurement, 1982
Results in the engineering literature on "k out of n system reliability" can be used to characterize tests based on estimates of the probability of correctly determining whether the examinee knows the correct response. In particular, the minimum number of distractors required for multiple-choice tests can be empirically determined.…
Descriptors: Achievement Tests, Mathematical Models, Multiple Choice Tests, Test Format
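The Wilcox entry borrows the engineering notion of "k out of n system reliability." A minimal sketch of that formula (illustrative only, not the paper's derivation), where p stands for the per-item probability of a correct decision about the examinee:

```python
from math import comb

def k_out_of_n_reliability(n: int, k: int, p: float) -> float:
    """P(at least k of n independent components succeed): the classic
    k-out-of-n system reliability, computed as a binomial tail sum."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# e.g. a 5-item test requiring 3 correct decisions, with p = 0.8 per item
```

Raising the per-item probability p (for instance by adding distractors) raises the system reliability, which is the lever the abstract describes for determining the minimum number of distractors.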

Campbell, Todd; And Others – Educational and Psychological Measurement, 1997
The construct validity of scores from the Bem Sex-Role Inventory was studied using confirmatory factor analysis methods on data from 791 subjects. Measurement characteristics of the long and short forms were studied, with the short form yielding more reliable scores, as has previously been indicated. (Author/SLD)
Descriptors: Adults, Construct Validity, Factor Structure, Scores

Tollefson, Nona – Educational and Psychological Measurement, 1987
This study compared the item difficulty, item discrimination, and test reliability of three forms of multiple-choice items: (1) one correct answer; (2) "none of the above" as a foil; and (3) "none of the above" as the correct answer. Twelve items in the three formats were administered in a college statistics examination. (BS)
Descriptors: Difficulty Level, Higher Education, Item Analysis, Multiple Choice Tests

Wainer, Howard; Lukhele, Robert – Educational and Psychological Measurement, 1997
The reliability of scores from four forms of the Test of English as a Foreign Language (TOEFL) was estimated using a hybrid item response theory model. It was found that there was very little difference between overall reliability when the testlet items were assumed to be independent and when their dependence was modeled. (Author/SLD)
Descriptors: English (Second Language), Item Response Theory, Scores, Second Language Learning

Green, Kathy; And Others – Educational and Psychological Measurement, 1982
Achievement test reliability and validity as a function of ability were determined for multiple sections of a large undergraduate French class. Results did not support previous arguments that decreasing the number of options results in a more efficient test for high-level examinees, but less efficient for low-level examinees. (Author/GK)
Descriptors: Academic Ability, Comparative Analysis, Higher Education, Multiple Choice Tests

Hancock, Gregory R.; And Others – Educational and Psychological Measurement, 1993
Two-option multiple-choice vocabulary test items are compared with comparably written true-false test items. Results from a study with 111 high school students suggest that multiple-choice items provide a significantly more reliable measure than the true-false format. (SLD)
Descriptors: Ability, High School Students, High Schools, Objective Tests

Straton, Ralph G.; Catts, Ralph M. – Educational and Psychological Measurement, 1980
Multiple-choice tests composed entirely of two-, three-, or four-choice items were investigated. Results indicated that the number of alternatives per item was inversely related to item difficulty but directly related to item discrimination. Reliability and standard error of measurement of three-choice item tests were equivalent or superior.…
Descriptors: Difficulty Level, Error of Measurement, Foreign Countries, Higher Education
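Several entries above weigh item format and test length against reliability. The standard tool for reasoning about such trade-offs is the Spearman-Brown prophecy formula (a textbook relationship, not code from any of these studies), which predicts reliability when a test's length is multiplied by a given factor:

```python
def spearman_brown(rho: float, factor: float) -> float:
    """Predicted reliability of a test whose length is multiplied by
    `factor`, given current reliability `rho` (assumes parallel items)."""
    return (factor * rho) / (1 + (factor - 1) * rho)

# Doubling a test with reliability .50 predicts a reliability of about .67
```

The parallel-items assumption matters: if added items measure the trait less well than existing ones, the formula overstates the gain.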

Schriesheim, Chester A.; And Others – Educational and Psychological Measurement, 1989
Three studies (two correlational, one experimental) explored the effects of grouped versus randomized questionnaire items on internal consistency and test-retest reliability, using samples of 80, 80, and 100 university students and undergraduates. The studies were reasonably consistent in demonstrating that neither format was…
Descriptors: Classification, College Students, Evaluation Methods, Higher Education

Paik, Chie; Michael, William B. – Educational and Psychological Measurement, 1999
Studied the internal consistency reliability and construct validity of scores on each of five dimensions of a Japanese version of the Dimensions of Self-Concept Scale. Results for 354 female high school students show that a five-factor oblique model accounts for the greatest proportion of covariance in the matrix of 15 subtests. Contains 20…
Descriptors: Construct Validity, Factor Structure, Females, Foreign Countries

Aiken, Lewis R. – Educational and Psychological Measurement, 1983
Each of six forms of a 10-item teacher evaluation rating scale, having two to seven response categories per form, was administered to over 100 college students. Means of item responses and item variances increased with the number of response categories. Internal consistency of total scores did not change systematically. (Author/PN)
Descriptors: College Students, Higher Education, Item Analysis, Rating Scales