Showing all 13 results
Peer reviewed
Ting Sun; Stella Yun Kim – Educational and Psychological Measurement, 2024
Equating is a statistical procedure used to adjust for the difference in form difficulty such that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude…
Descriptors: Difficulty Level, Data Interpretation, Equated Scores, High School Students
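The abstract above concerns score equating across forms of unequal difficulty. As a generic illustration (linear equating is one common method; the study itself may examine others), a minimal sketch with hypothetical score data:

```python
# Linear equating: map a raw score x earned on form X onto the scale of
# form Y by matching the means and standard deviations of the two score
# distributions. Generic illustration, not the specific method studied.
from statistics import mean, pstdev

def linear_equate(x, form_x_scores, form_y_scores):
    """Return the form-Y equivalent of raw score x from form X."""
    mu_x, sd_x = mean(form_x_scores), pstdev(form_x_scores)
    mu_y, sd_y = mean(form_y_scores), pstdev(form_y_scores)
    return mu_y + (sd_y / sd_x) * (x - mu_x)

# Hypothetical data: form X is harder (lower mean), so X scores are
# adjusted upward onto the Y scale.
form_x = [10, 12, 14, 16, 18]
form_y = [12, 14, 16, 18, 20]
print(linear_equate(14, form_x, form_y))  # X mean maps to Y mean: 16.0
```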
Peer reviewed
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…
Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level
Peer reviewed
Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…
Descriptors: Scoring, Test Items, Test Format, Raw Scores
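The abstract above notes that multiple-response items can be scored several ways. Two common raw-score rules, sketched with a made-up item (the article compares such methods; these two are illustrative, not its specific list):

```python
# Two ways to score a multiple-response item: all-or-nothing versus
# polytomous partial credit. Generic illustration with hypothetical data.

def score_all_or_nothing(selected, keyed):
    """One point only if the selected options exactly match the key."""
    return 1 if set(selected) == set(keyed) else 0

def score_partial_credit(selected, keyed, options):
    """One point per option classified correctly, i.e. selected-and-keyed
    or unselected-and-unkeyed (polytomous scoring)."""
    selected, keyed = set(selected), set(keyed)
    return sum(1 for opt in options if (opt in selected) == (opt in keyed))

options = ["A", "B", "C", "D", "E"]
key = ["A", "C"]
response = ["A", "C", "D"]  # both keyed options plus one wrong pick
print(score_all_or_nothing(response, key))           # 0
print(score_partial_credit(response, key, options))  # 4: only D misclassified
```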
Peer reviewed
Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Descriptors: Sample Size, Test Length, Correlation, Test Format
Peer reviewed
Hohensinn, Christine; Kubinger, Klaus D. – Educational and Psychological Measurement, 2011
In aptitude and achievement tests, different response formats are usually used. A fundamental distinction must be made between the class of multiple-choice formats and the constructed response formats. Previous studies have examined the impact of different response formats applying traditional statistical approaches, but these influences can also…
Descriptors: Item Response Theory, Multiple Choice Tests, Responses, Test Format
Peer reviewed
Kubinger, Klaus D. – Educational and Psychological Measurement, 2009
The linear logistic test model (LLTM) breaks down the item parameter of the Rasch model as a linear combination of some hypothesized elementary parameters. Although the original purpose of applying the LLTM was primarily to generate test items with specified item difficulty, there are still many other potential applications, which may be of use…
Descriptors: Models, Test Items, Psychometrics, Item Response Theory
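The LLTM decomposition described above writes each Rasch item difficulty as a weighted sum of elementary parameters, sigma_i = sum_j q_ij * eta_j. A minimal sketch; the Q-matrix weights and eta values below are invented for illustration:

```python
# LLTM: compose item difficulties from a Q-matrix (items x cognitive
# operations) and elementary parameters eta. All numbers are hypothetical.

def lltm_item_difficulties(Q, eta):
    """Return sigma_i = sum_j q_ij * eta_j for each item row in Q."""
    return [sum(q_ij * eta_j for q_ij, eta_j in zip(row, eta)) for row in Q]

# Three items, two hypothesized elementary operations.
Q = [[1, 0],   # item 1 requires operation 1 once
     [0, 2],   # item 2 requires operation 2 twice
     [1, 1]]   # item 3 requires both operations
eta = [0.5, -0.25]  # difficulty contributed by each operation
print(lltm_item_difficulties(Q, eta))  # [0.5, -0.5, 0.25]
```

In estimation the direction is reversed: the eta parameters are fitted so that these composed difficulties reproduce the observed Rasch item parameters.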
Peer reviewed
Tollefson, Nona – Educational and Psychological Measurement, 1987
This study compared the item difficulty, item discrimination, and test reliability of three forms of multiple-choice items: (1) one correct answer; (2) "none of the above" as a foil; and (3) "none of the above" as the correct answer. Twelve items in the three formats were administered in a college statistics examination. (BS)
Descriptors: Difficulty Level, Higher Education, Item Analysis, Multiple Choice Tests
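The three statistics compared in the study above have standard classical-test-theory formulas. A sketch with a tiny hypothetical 0/1 score matrix (generic formulas, not the study's data):

```python
# Classical item analysis for dichotomous items: difficulty = proportion
# correct; discrimination = corrected point-biserial (item vs. rest-score);
# reliability = KR-20. Data below are hypothetical.
from statistics import mean, pstdev, pvariance

def item_difficulty(item_scores):
    """p-value: proportion of examinees answering the item correctly."""
    return mean(item_scores)

def item_discrimination(item_scores, score_matrix, item_index):
    """Correlation of the item with the total score excluding that item."""
    rest = [sum(row) - row[item_index] for row in score_matrix]
    m_i, m_r = mean(item_scores), mean(rest)
    cov = mean((i - m_i) * (r - m_r) for i, r in zip(item_scores, rest))
    return cov / (pstdev(item_scores) * pstdev(rest))

def kr20(score_matrix):
    """KR-20 reliability for a test of 0/1-scored items."""
    k = len(score_matrix[0])
    var_total = pvariance([sum(row) for row in score_matrix])
    pq = sum(p * (1 - p) for p in (mean(col) for col in zip(*score_matrix)))
    return (k / (k - 1)) * (1 - pq / var_total)

data = [            # rows: examinees; columns: three items
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
print(item_difficulty([1, 1, 1, 0]))                   # 0.75
print(round(item_discrimination([0, 0, 1, 0], data, 2), 3))  # 0.577
print(round(kr20(data), 3))                            # 0.273
```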
Peer reviewed
Knowles, Susan L.; Welch, Cynthia A. – Educational and Psychological Measurement, 1992
A meta-analysis of the difficulty and discrimination of the "none-of-the-above" (NOTA) test option was conducted with 12 articles (20 effect sizes) for difficulty and 7 studies (11 effect sizes) for discrimination. Findings indicate that using the NOTA option does not result in items of lesser quality. (SLD)
Descriptors: Difficulty Level, Effect Size, Meta Analysis, Multiple Choice Tests
Peer reviewed
Straton, Ralph G.; Catts, Ralph M. – Educational and Psychological Measurement, 1980
Multiple-choice tests composed entirely of two-, three-, or four-choice items were investigated. Results indicated that the number of alternatives per item was inversely related to item difficulty but directly related to item discrimination. The reliability and standard error of measurement of three-choice item tests were equivalent or superior.…
Descriptors: Difficulty Level, Error of Measurement, Foreign Countries, Higher Education
Peer reviewed
Cizek, Gregory J. – Educational and Psychological Measurement, 1994
The performance of a common set of test items was examined on an examination in which the order of options for one test form was experimentally manipulated. Results for 759 medical specialty board examinees indicate that reordering item options has significant but unpredictable effects on item difficulty. (SLD)
Descriptors: Change, Difficulty Level, Equated Scores, Licensing Examinations (Professions)
Peer reviewed
Crehan, Kevin D.; And Others – Educational and Psychological Measurement, 1993
Studies with 220 college students found that multiple-choice test items with three options are more difficult than those with four options, and that items with a none-of-these option are more difficult than those without it. Neither format manipulation affected item discrimination. Implications for test construction are discussed. (SLD)
Descriptors: College Students, Comparative Testing, Difficulty Level, Distractors (Tests)
Peer reviewed
Styles, Irene; Andrich, David – Educational and Psychological Measurement, 1993
This paper describes the use of the Rasch model to help implement computerized administration of the standard and advanced forms of Raven's Progressive Matrices (RPM), to compare relative item difficulties, and to convert scores between the standard and advanced forms. The sample consisted of 95 girls and 95 boys in Australia. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Difficulty Level, Elementary Education
Peer reviewed
Aiken, Lewis R. – Educational and Psychological Measurement, 1989
Two alternatives to traditional item analysis and reliability estimation procedures are considered for determining the difficulty, discrimination, and reliability of optional items on essay and other tests. A computer program to compute these measures is described, and illustrations are given. (SLD)
Descriptors: College Entrance Examinations, Computer Software, Difficulty Level, Essay Tests