Showing all 8 results
Peer reviewed
Georgios Zacharis; Stamatios Papadakis – Educational Process: International Journal, 2025
Background/purpose: Generative artificial intelligence (GenAI) is often promoted as a transformative tool for assessment, yet evidence of its validity compared to human raters remains limited. This study examined whether an AI-based rater could be used interchangeably with trained faculty in scoring complex coursework. Materials/methods:…
Descriptors: Artificial Intelligence, Technology Uses in Education, Computer Assisted Testing, Grading
Peer reviewed
Ahmadi Shirazi, Masoumeh – SAGE Open, 2019
Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…
Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests
Peer reviewed
Downer, Jason T.; Booren, Leslie M.; Lima, Olivia K.; Luckner, Amy E.; Pianta, Robert C. – Early Childhood Research Quarterly, 2010
This paper introduces the Individualized Classroom Assessment Scoring System (inCLASS), an observation tool that targets children's interactions in preschool classrooms with teachers, peers, and tasks. In particular, initial evidence is reported of the extent to which the inCLASS meets the following psychometric criteria: inter-rater reliability,…
Descriptors: Construct Validity, Validity, Interrater Reliability, Scoring
Peer reviewed
Murley, Lisa D.; Stobaugh, Rebecca; Jukes, Pamela; Tassell, Janet – Educational Renaissance, 2014
The purpose of this article is to provide an overview of the process used to examine the inter-rater reliability of the Teacher Work Sample (TWS) Scoring Rubric involved with the senior culminating experience for teacher candidates used at a large comprehensive university. The study compared holistic and analytic scores reported by Student Teacher…
Descriptors: Teacher Education, Interrater Reliability, Scoring Rubrics, Preservice Teachers
Peer reviewed
Storch, Eric A.; Rasmussen, Steven A.; Price, Lawrence H.; Larson, Michael J.; Murphy, Tanya K.; Goodman, Wayne K. – Psychological Assessment, 2010
The Yale-Brown Obsessive-Compulsive Scale (Y-BOCS; Goodman, Price, Rasmussen, Mazure, Delgado, et al., 1989) is acknowledged as the gold standard measure of obsessive-compulsive disorder (OCD) symptom severity. A number of areas where the Y-BOCS may benefit from revision have emerged in past psychometric studies of the Severity Scale and Symptom…
Descriptors: Check Lists, Construct Validity, Validity, Measures (Individuals)
Peer reviewed
Katz, Irvin R.; Elliot, Norbert; Attali, Yigal; Scharf, Davida; Powers, Donald; Huey, Heather; Joshi, Kamal; Briller, Vladimir – ETS Research Report Series, 2008
This study presents an investigation of information literacy as defined by the ETS iSkills™ assessment and by the New Jersey Institute of Technology (NJIT) Information Literacy Scale (ILS). As two related but distinct measures, both iSkills and the ILS were used with undergraduate students at NJIT during the spring 2006 semester. Undergraduate…
Descriptors: Information Literacy, Information Skills, Skill Analysis, Case Studies
Boldt, Robert F.; Oltman, Philip K. – 1993
Administration of the Test of Spoken English (TSE) yields tapes of oral performance on items within six sections of the test. Trained scorers subsequently rate responses using four proficiency scales: pronunciation, grammar, fluency, and overall comprehensibility. This project examined the consistency of statistical relations among TSE scores with…
Descriptors: Audiotape Recordings, Construct Validity, Correlation, English (Second Language)
Peer reviewed
Clariana, Roy B.; Wallace, Patricia – Journal of Educational Computing Research, 2007
This proof-of-concept investigation describes a computer-based approach for deriving the knowledge structure of individuals and of groups from their written essays, and considers the convergent criterion-related validity of the computer-based scores relative to human rater essay scores and multiple-choice test scores. After completing a…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Construct Validity, Cognitive Structures