Showing 1 to 15 of 20 results
Peer reviewed
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Peer reviewed
Liu, Ou Lydia; Rios, Joseph A.; Heilman, Michael; Gerard, Libby; Linn, Marcia C. – Journal of Research in Science Teaching, 2016
Constructed response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine…
Descriptors: Science Tests, Scoring, Automation, Validity
Peer reviewed
He, Chunxiu – International Journal of Higher Education, 2019
Given the practical significance of vocabulary testing in language teaching and the theoretical foundations of developing a vocabulary test, four well-established vocabulary tests are introduced for diagnostic purposes together with their corresponding validation studies, with a focus on the designed purpose, the selection of the items, the…
Descriptors: Vocabulary Development, Language Tests, Validity, Test Format
Peer reviewed
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions made on the basis of the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Peer reviewed
Ihme, Jan Marten; Senkbeil, Martin; Goldhammer, Frank; Gerick, Julia – European Educational Research Journal, 2017
Combinations of different item formats are found quite often in large-scale assessments, and analyses of dimensionality often indicate that such tests are multi-dimensional with respect to task format. In ICILS 2013, three different item types (information-based response tasks, simulation tasks, and authoring tasks) were used to measure computer and…
Descriptors: Foreign Countries, Computer Literacy, Information Literacy, International Assessment
Peer reviewed
Funk, Steven C.; Dickson, K. Laurie – Teaching of Psychology, 2011
The authors experimentally investigated the effects of multiple-choice and short-answer format exam items on exam performance in a college classroom. They randomly assigned 50 students to take a 10-item short-answer pretest or posttest on two 50-item multiple-choice exams in an introduction to personality course. Students performed significantly…
Descriptors: Test Items, Test Format, Multiple Choice Tests, Validity
Peer reviewed
Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012
This study examined the validity of test accommodations for third- through eighth-graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
Peer reviewed
Scarpati, Stanley E.; Wells, Craig S.; Lewis, Christine; Jirka, Stephen – Journal of Special Education, 2011
The purpose of this study was to use differential item functioning (DIF) and latent mixture model analyses to explore factors that explain performance differences on a large-scale mathematics assessment between examinees who were allowed to use a calculator or were afforded item presentation accommodations and those who did not receive the same…
Descriptors: Testing Accommodations, Test Items, Test Format, Validity
Peer reviewed
Lakin, Joni M.; Gambrell, James L. – Intelligence, 2012
Measures of broad fluid abilities including verbal, quantitative, and figural reasoning are commonly used in the K-12 school context for a variety of purposes. However, differentiation of these domains is difficult for young children (grades K-2) who lack basic linguistic and mathematical literacy. This study examined the latent factor structure…
Descriptors: Evidence, Validity, Item Response Theory, Numeracy
Peer reviewed
Schott, G. R.; Bellin, W. – Evaluation & Research in Education, 2001
Developed an approach to account for the impact of item presentation on ensuing constructs in the development of two versions of a self-report measure, the Relational Concept Scale, that was tested with 978 adolescent students in the United Kingdom. Outlines benefits of developing two versions of the scale to protect against presentational bias.…
Descriptors: Adolescents, Foreign Countries, Statistical Bias, Test Construction
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
The results of 96 theoretical and empirical studies were reviewed to determine whether they support a taxonomy of 43 rules for writing multiple-choice test items. The taxonomy is the result of an analysis of 46 textbooks dealing with multiple-choice item writing. For nearly half of the rules, no research was found. (SLD)
Descriptors: Classification, Literature Reviews, Multiple Choice Tests, Test Construction
Peer reviewed
Hagtvet, Knut A.; Nasser, Fadia M. – Structural Equation Modeling, 2004
This article presents a methodology for examining the content and nature of item parcels as indicators of a conceptually defined latent construct. An essential component of this methodology is the two-facet measurement model, which includes items and parcels as facets of construct indicators. The two-facet model tests assumptions required for…
Descriptors: Evaluation Methods, Validity, Test Anxiety, Content Validity
Haladyna, Thomas M. – 1999
This book explains how to write effective multiple-choice test items and how to study responses to items in order to evaluate and improve them, two topics that are very important in the development of many cognitive tests. The chapters are: (1) "Providing a Context for Multiple-Choice Testing"; (2) "Constructed-Response and Multiple-Choice Item Formats"; (3)…
Descriptors: Constructed Response, Multiple Choice Tests, Test Construction, Test Format
Peer reviewed
Crehan, Kevin; Haladyna, Thomas M. – Journal of Experimental Education, 1991
Two item-writing rules were tested: phrasing stems as questions rather than partial sentences, and using the "none-of-the-above" option instead of a specific content option. Results with 228 college students do not support the use of either stem type and provide limited evidence to caution against the "none-of-the-above" option.…
Descriptors: College Students, Higher Education, Multiple Choice Tests, Test Construction
Hambleton, Ronald K.; Patsula, Liane – 2000
Whatever the purpose of test adaptation, questions arise concerning the validity of inferences from such adapted tests. This paper considers several advantages and disadvantages of adapting tests from one language and culture to another. The paper also reviews several sources of error or invalidity associated with adapting tests and suggests ways…
Descriptors: Cross Cultural Studies, Cultural Awareness, Quality of Life, Test Construction