Showing all 4 results
Peer reviewed
McCaffrey, Daniel F.; Casabianca, Jodi M.; Ricker-Pedley, Kathryn L.; Lawless, René R.; Wendler, Cathy – ETS Research Report Series, 2022
This document describes a set of best practices for developing, implementing, and maintaining the critical process of scoring constructed-response tasks. These practices address both the use of human raters and automated scoring systems as part of the scoring process and cover the scoring of written, spoken, performance, or multimodal responses.…
Descriptors: Best Practices, Scoring, Test Format, Computer Assisted Testing
Peer reviewed
Schumacker, Randall E.; Smith, Everett V., Jr. – Educational and Psychological Measurement, 2007
Measurement error is a common theme in classical measurement models used in testing and assessment. In classical measurement models, the definition of measurement error and the subsequent reliability coefficients differ on the basis of the test administration design. Internal consistency reliability specifies error due primarily to poor item…
Descriptors: Measurement Techniques, Error of Measurement, Item Sampling, Item Response Theory
van der Linden, Wim J.; Vos, Hans J.; Chang, Lei – 2000
In judgmental standard setting experiments, it may be difficult to specify subjective probabilities that adequately take the properties of the items into account. As a result, these probabilities are not consistent with each other in the sense that they do not refer to the same borderline level of performance. Methods to check standard setting…
Descriptors: Interrater Reliability, Judges, Probability, Standard Setting
Peer reviewed
Clariana, Roy B.; Wallace, Patricia – Journal of Educational Computing Research, 2007
This proof-of-concept investigation describes a computer-based approach for deriving the knowledge structure of individuals and of groups from their written essays, and considers the convergent criterion-related validity of the computer-based scores relative to human rater essay scores and multiple-choice test scores. After completing a…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Construct Validity, Cognitive Structures