ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	2
Since 2017 (last 10 years)	3
Since 2007 (last 20 years)	8

Descriptor

Difficulty Level	12
Goodness of Fit	12
Test Format	12
Test Items	10
Item Analysis	5
Item Response Theory	5
Models	4
Foreign Countries	3
Mathematical Models	3
Scores	3
Statistical Analysis	3
Test Construction	3
College Entrance Examinations	2
Comparative Analysis	2
Elementary Secondary Education	2
English (Second Language)	2
Equated Scores	2
Individual Differences	2
Language Tests	2
Listening Comprehension Tests	2
Mathematics Tests	2
Multiple Choice Tests	2
Raw Scores	2
Reading Tests	2
Scaling	2
More ▼

Source

Applied Psychological…	1
Behavioral Research and…	1
Chemical Engineering Education	1
Educational and Psychological…	1
International Journal of…	1
Journal of Educational…	1
Journal of Psychoeducational…	1
Pearson	1
SAGE Open	1

Publication Type

Reports - Research	8
Journal Articles	7
Reports - Evaluative	4
Speeches/Meeting Papers	4
Numerical/Quantitative Data	1

Education Level

Elementary Secondary Education	3
Grade 8	2
Higher Education	2
Postsecondary Education	2
Grade 1	1
Grade 2	1
Grade 3	1
Grade 4	1
Grade 5	1
Grade 6	1
Grade 7	1
Secondary Education	1
More ▼

Audience

Researchers

Location

Iran	2
Belgium	1
China	1
Malaysia	1
Oregon	1
Philippines	1
Singapore	1
Turkey	1

Laws, Policies, & Programs

Assessments and Surveys

International English…	1
Program for International…	1

What Works Clearinghouse Rating

Showing all 12 results Save | Export

Investigating Learning and Improving Teaching in Engineering Thermodynamics Guided by Constructive Alignment and Competency Modeling: Part II. Assessment & Exam Design

Peer reviewed

Direct link

Braun, Thorsten; Stierle, Rolf; Fischer, Matthias; Gross, Joachim – Chemical Engineering Education, 2023

Contributing to a competency model for engineering thermodynamics, we investigate the empirical competency structure of our exams in an attempt to answer the question: Do we test the competencies we want to convey to our students? We demonstrate that thermodynamic modeling and mathematical solution emerge as significant dimensions of thermodynamic…

Descriptors: Thermodynamics, Consciousness Raising, Engineering Education, Test Format

Evaluating Different Scoring Methods for Multiple Response Items Providing Partial Credit

Peer reviewed

Direct link

Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022

The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…

Descriptors: Scoring, Test Items, Test Format, Raw Scores

Method Bias in Cloze Tests as Reading Comprehension Measures

Peer reviewed

Direct link

Baghaei, Purya; Ravand, Hamdollah – SAGE Open, 2019

In many reading comprehension tests, different test formats are employed. Two commonly used test formats to measure reading comprehension are sustained passages followed by some questions and cloze items. Individual differences in handling test format peculiarities could constitute a source of score variance. In this study, a bifactor Rasch model…

Descriptors: Cloze Procedure, Test Bias, Individual Differences, Difficulty Level

Modeling Local Item Dependence Due to Common Test Format with a Multidimensional Rasch Model

Peer reviewed

Direct link

Baghaei, Purya; Aryadoust, Vahid – International Journal of Testing, 2015

Research shows that test method can exert a significant impact on test takers' performance and thereby contaminate test scores. We argue that common test method can exert the same effect as common stimuli and violate the conditional independence assumption of item response theory models because, in general, subsets of items which have a shared…

Descriptors: Test Format, Item Response Theory, Models, Test Items

Examining the Teachers' Sense of Efficacy Scale at the Item Level with Rasch Measurement Model

Peer reviewed

Direct link

Chang, Mei-Lin; Engelhard, George, Jr. – Journal of Psychoeducational Assessment, 2016

The purpose of this study is to examine the psychometric quality of the Teachers' Sense of Efficacy Scale (TSES) with data collected from 554 teachers in a U.S. Midwestern state. The many-facet Rasch model was used to examine several potential contextual influences (years of teaching experience, school context, and levels of emotional exhaustion)…

Descriptors: Models, Teacher Attitudes, Self Efficacy, Item Response Theory

Modeling Item-Position Effects within an IRT Framework

Peer reviewed

Direct link

Debeer, Dries; Janssen, Rianne – Journal of Educational Measurement, 2013

Changing the order of items between alternate test forms to prevent copying and to enhance test security is a common practice in achievement testing. However, these changes in item order may affect item and test characteristics. Several procedures have been proposed for studying these item-order effects. The present study explores the use of…

Descriptors: Item Response Theory, Test Items, Test Format, Models

Population Invariance of Vertical Scaling Results

Direct link

Powers, Sonya; Turhan, Ahmet; Binici, Salih – Pearson, 2012

The population sensitivity of vertical scaling results was evaluated for a state reading assessment spanning grades 3-10 and a state mathematics test spanning grades 3-8. Subpopulations considered included males and females. The 3-parameter logistic model was used to calibrate math and reading items and a common item design was used to construct…

Descriptors: Scaling, Equated Scores, Standardized Tests, Reading Tests

Examining Item Functioning of Math Screening Measures for Grades 1-8 Students. Technical Report Number 08-04

Download full text

Liu, Kimy; Ketterlin-Geller, Leanne R.; Yovanoff, Paul; Tindal, Gerald – Behavioral Research and Teaching, 2008

BRT Math Screening Measures focus on students' mathematics performance in grade-level standards for students in grades 1-8. A total of 24 test forms are available with three test forms per grade corresponding to fall, winter, and spring testing periods. Each form contains computation problems and application problems. BRT Math Screening Measures…

Descriptors: Test Items, Test Format, Test Construction, Item Response Theory

Severity of Grading across Time Periods.

Download full text

Lunz, Mary E.; Stahl, John A. – 1990

Three examinations administered to medical students were analyzed to determine differences among severities of judges' assessments and among grading periods. The examinations included essay, clinical, and oral forms of the tests. Twelve judges graded the three essays for 32 examinees during a 4-day grading session, which was divided into eight…

Descriptors: Clinical Diagnosis, Comparative Testing, Difficulty Level, Essay Tests

A Multidimensional Partial Credit Model with Associated Item and Test Statistics: An Application to Mixed-Format Tests

Peer reviewed

Direct link

Yao, Lihua; Schwarz, Richard D. – Applied Psychological Measurement, 2006

Multidimensional item response theory (IRT) models have been proposed for better understanding the dimensional structure of data or to define diagnostic profiles of student learning. A compensatory multidimensional two-parameter partial credit model (M-2PPC) for constructed-response items is presented that is a generalization of those proposed to…

Descriptors: Models, Item Response Theory, Markov Processes, Monte Carlo Methods

The Effect of Item Presentation on Item Difficulty and Discrimination in Language-Usage Tests.

Huntley, Renee M.; Carlson, James E. – 1986

This study compared student performance on language-usage test items presented in two different formats: as discrete sentences and as items embedded in passages. American College Testing (ACT) Program's Assessment experimental units were constructed that presented 40 items in the two different formats. Results suggest item presentation may not…

Descriptors: College Entrance Examinations, Difficulty Level, Goodness of Fit, Item Analysis

Practical Questions about Item Response Models in Large-Scale Assessment Programs.

Download full text

Legg, Sue M.; Algina, James – 1986

This paper focuses on the questions which arise as test practitioners monitor score scales derived from latent trait theory. Large scale assessment programs are dynamic and constantly challenge the assumptions and limits of latent trait models. Even though testing programs evolve, test scores must remain reliable indicators of progress.…

Descriptors: Difficulty Level, Educational Assessment, Elementary Secondary Education, Equated Scores

Baghaei, Purya	2
Algina, James	1
Aryadoust, Vahid	1
Betts, Joe	1
Binici, Salih	1
Braun, Thorsten	1
Carlson, James E.	1
Chang, Mei-Lin	1
Debeer, Dries	1
Engelhard, George, Jr.	1
Fischer, Matthias	1
Gross, Joachim	1
Huntley, Renee M.	1
Janssen, Rianne	1
Kao, Shu-chuan	1
Ketterlin-Geller, Leanne R.	1
Kim, Doyoung	1
Legg, Sue M.	1
Liu, Kimy	1
Lunz, Mary E.	1
Muntean, William	1
Powers, Sonya	1
Ravand, Hamdollah	1
Schwarz, Richard D.	1
Stahl, John A.	1
More ▼