Showing all 14 results
Peer reviewed
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
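For context (standard background, not a detail drawn from this article): the overall standard error of measurement that such conditional analyses generalize is conventionally estimated in classical test theory as

SEM = \sigma_X \sqrt{1 - \rho_{XX'}}

where \sigma_X is the observed-score standard deviation and \rho_{XX'} the score reliability; a conditional SEM evaluates this error at each score level rather than averaging over the full score distribution.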
Peer reviewed
Neiro, Jakke; Johansson, Niko – LUMAT: International Journal on Math, Science and Technology Education, 2020
The history and evolution of science assessment remain poorly known, especially with regard to the content of exam questions. Here we analyze the Finnish matriculation examination in biology from the 1920s to the 1960s to understand how the exam has evolved in both its knowledge content and educational form. Each question was classified according to…
Descriptors: Foreign Countries, Biology, Test Content, Test Format
Peer reviewed
Schriesheim, Chester A. – Educational and Psychological Measurement, 1981
This study provides support for the hypothesized effect of leniency on the discriminant validity of grouped questionnaire items. It was found that controlling for leniency resulted in a slight decrement in convergent validity but that discriminant validity was substantially improved. Implications for questionnaire validity and further research are…
Descriptors: Classification, Correlation, Questionnaires, Research Problems
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
Results of 96 theoretical/empirical studies were reviewed to see if they support a taxonomy of 43 rules for writing multiple-choice test items. The taxonomy is the result of an analysis of 46 textbooks dealing with multiple-choice item writing. For nearly half of the rules, no research was found. (SLD)
Descriptors: Classification, Literature Reviews, Multiple Choice Tests, Test Construction
Schulz, E. Matthew; Wang, Lin – 2001
In this study, items were drawn from a full-length test of 30 items to construct shorter tests for making accurate pass/fail classifications at a specific criterion point on the latent ability metric. A three-parameter item response theory (IRT) framework was used. The criterion point on the latent ability…
Descriptors: Ability, Classification, Item Response Theory, Pass Fail Grading
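For reference (the standard textbook form, not a result reported in this abstract): the three-parameter logistic IRT model named here gives the probability of a correct response to item i as

P_i(\theta) = c_i + (1 - c_i) \frac{1}{1 + e^{-D a_i (\theta - b_i)}}

where a_i, b_i, and c_i are the item discrimination, difficulty, and pseudo-guessing parameters, \theta is the latent ability, and D \approx 1.7 is a scaling constant.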
Peer reviewed
Bruton, Anthony – Canadian Modern Language Review, 2007
This analysis evaluates receptive tests of targeted lexical knowledge in the written medium, of the kind typically used in empirical research on lexical acquisition from reading foreign/second-language texts. Apart from the types of second-language cues or prompts, and the language of the responses, the main issues revolve around: (a) the…
Descriptors: Knowledge Level, Form Classes (Languages), Second Language Learning, Vocabulary Development
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
A taxonomy of 43 rules for writing multiple-choice test items is presented, based on a consensus of 46 textbooks. These guidelines are presented as complete and authoritative, with solid consensus apparent for 33 of the rules. Four rules lack consensus, and five rules were cited fewer than 10 times. (SLD)
Descriptors: Classification, Interrater Reliability, Multiple Choice Tests, Objective Tests
Nissan, Susan; And Others – 1996
One of the item types in the Listening Comprehension section of the Test of English as a Foreign Language (TOEFL) test is the dialogue. Because the dialogue item pool needs to have an appropriate balance of items at a range of difficulty levels, test developers have examined items at various difficulty levels in an attempt to identify their…
Descriptors: Classification, Dialogs (Language), Difficulty Level, English (Second Language)
Sykes, Robert C.; And Others – 1992
A part-form methodology was used to study the effect of varying degrees of multidimensionality on the consistency of pass/fail classification decisions obtained from simulated unidimensional item response theory (IRT)-based licensure examinations. A control on the degree of form multidimensionality permitted an assessment throughout the range of…
Descriptors: Classification, Comparative Testing, Computer Simulation, Decision Making
Wise, Lauress – 1993
As high-stakes use of tests increases, it becomes vital that test developers and test users communicate clearly about the accuracy and limitations of the scores generated by a test after it is assembled and used. A procedure is described for portraying the accuracy of test scores. It can be used in setting accuracy targets during form construction…
Descriptors: Classification, High Stakes Tests, Item Response Theory, Military Personnel
Hanson, Bradley A.; Bay, Luz; Loomis, Susan Cooper – 1998
Research studies using booklet classification were conducted by the American College Testing Program to investigate the linkage between the National Assessment of Educational Progress (NAEP) Achievement Levels Descriptions and the cutpoints set to represent student performance with respect to the achievement levels. This paper describes the…
Descriptors: Academic Achievement, Classification, Cutting Scores, Discriminant Analysis
Stansfield, Charles W. – 1990
The IDEA Oral Language Proficiency Test (IPT II), an individually administered measure of speaking and listening proficiency in English as a Second Language designed for secondary school students, is described and discussed. The test consists of 91 items and requires 5 to 25 minutes to administer. Raw scores are converted to one of seven proficiency…
Descriptors: Classification, English (Second Language), Language Proficiency, Language Tests
Finch, F. L.; Dost, Marcia A. – 1992
Many state and local entities are developing and using performance assessment programs. Because these initiatives are so diverse, it is difficult to understand what they are doing or to compare them in any meaningful way. Multiple-choice tests are contrasted with performance assessments, and preliminary classifications are suggested to…
Descriptors: Alternative Assessment, Classification, Comparative Analysis, Constructed Response
Read, John; Nation, Paul – 1986
A review of the literature on a variety of issues related to testing vocabulary knowledge in a second language addresses these topics: problems in estimating vocabulary size, including the related questions of what constitutes a word, how a sample should be selected, and what the criteria are for knowing a word; sampling the basic and specialized…
Descriptors: Achievement Tests, Check Lists, Classification, Comparative Analysis