ERIC - Search Results

Showing all 8 results

Can AI Grade Like a Human? Validity, Reliability, and Fairness in University Coursework Assessment

Peer reviewed

Zacharis, Georgios; Papadakis, Stamatios – Educational Process: International Journal, 2025

Background/purpose: Generative artificial intelligence (GenAI) is often promoted as a transformative tool for assessment, yet evidence of its validity compared to human raters remains limited. This study examined whether an AI-based rater could be used interchangeably with trained faculty in scoring complex coursework. Materials/methods:…

Descriptors: Artificial Intelligence, Technology Uses in Education, Computer Assisted Testing, Grading

For a Greater Good: Bias Analysis in Writing Assessment

Peer reviewed

Ahmadi Shirazi, Masoumeh – SAGE Open, 2019

Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…

Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests

The Individualized Classroom Assessment Scoring System (inCLASS): Preliminary Reliability and Validity of a System for Observing Preschoolers' Competence in Classroom Interactions

Peer reviewed

Downer, Jason T.; Booren, Leslie M.; Lima, Olivia K.; Luckner, Amy E.; Pianta, Robert C. – Early Childhood Research Quarterly, 2010

This paper introduces the Individualized Classroom Assessment Scoring System (inCLASS), an observation tool that targets children's interactions in preschool classrooms with teachers, peers, and tasks. In particular, initial evidence is reported of the extent to which the inCLASS meets the following psychometric criteria: inter-rater reliability,…

Descriptors: Construct Validity, Validity, Interrater Reliability, Scoring

Examining the Reliability of a Culminating Teacher Education Assessment and Discovering Areas for Reform

Peer reviewed

Murley, Lisa D.; Stobaugh, Rebecca; Jukes, Pamela; Tassell, Janet – Educational Renaissance, 2014

The purpose of this article is to provide an overview of the process used to examine the inter-rater reliability of the Teacher Work Sample (TWS) Scoring Rubric involved with the senior culminating experience for teacher candidates used at a large comprehensive university. The study compared holistic and analytic scores reported by Student Teacher…

Descriptors: Teacher Education, Interrater Reliability, Scoring Rubrics, Preservice Teachers

Development and Psychometric Evaluation of the Yale-Brown Obsessive-Compulsive Scale--Second Edition

Peer reviewed

Storch, Eric A.; Rasmussen, Steven A.; Price, Lawrence H.; Larson, Michael J.; Murphy, Tanya K.; Goodman, Wayne K. – Psychological Assessment, 2010

The Yale-Brown Obsessive-Compulsive Scale (Y-BOCS; Goodman, Price, Rasmussen, Mazure, Delgado, et al., 1989) is acknowledged as the gold standard measure of obsessive-compulsive disorder (OCD) symptom severity. A number of areas where the Y-BOCS may benefit from revision have emerged in past psychometric studies of the Severity Scale and Symptom…

Descriptors: Check Lists, Construct Validity, Validity, Measures (Individuals)

The Assessment of Information Literacy: A Case Study. Research Report. ETS RR-08-33

Peer reviewed

Katz, Irvin R.; Elliot, Norbert; Attali, Yigal; Scharf, Davida; Powers, Donald; Huey, Heather; Joshi, Kamal; Briller, Vladimir – ETS Research Report Series, 2008

This study presents an investigation of information literacy as defined by the ETS iSkills™ assessment and by the New Jersey Institute of Technology (NJIT) Information Literacy Scale (ILS). As two related but distinct measures, both iSkills and the ILS were used with undergraduate students at NJIT during the spring 2006 semester. Undergraduate…

Descriptors: Information Literacy, Information Skills, Skill Analysis, Case Studies

Multimethod Construct Validation of the Test of Spoken English. Report 46.

Boldt, Robert F.; Oltman, Philip K. – 1993

Administration of the Test of Spoken English (TSE) yields tapes of oral performance on items within six sections of the test. Trained scorers subsequently rate responses using four proficiency scales: pronunciation, grammar, fluency, and overall comprehensibility. This project examined the consistency of statistical relations among TSE scores with…

Descriptors: Audiotape Recordings, Construct Validity, Correlation, English (Second Language)

A Computer-Based Approach for Deriving and Measuring Individual and Team Knowledge Structure from Essay Questions

Peer reviewed

Clariana, Roy B.; Wallace, Patricia – Journal of Educational Computing Research, 2007

This proof-of-concept investigation describes a computer-based approach for deriving the knowledge structure of individuals and of groups from their written essays, and considers the convergent criterion-related validity of the computer-based scores relative to human rater essay scores and multiple-choice test scores. After completing a…

Descriptors: Computer Assisted Testing, Multiple Choice Tests, Construct Validity, Cognitive Structures
