ERIC - Search Results

Showing 1 to 15 of 23 results

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Engaging Classroom Observation: A Brief Measure of Active Learning in the College Classroom

Peer reviewed

Chase Young; Benjamin Mitchell-Yellin; George Kevin Randall – Active Learning in Higher Education, 2025

The purpose of this study was to develop a valid, reliable, and brief measure of active learning in college classrooms that is cheap and easy to complete and yields results that faculty can easily use to inform their development as instructors. Initial construct and face validity was achieved by modifying existing instruments and creating a draft…

Descriptors: College Faculty, College Students, Active Learning, Classroom Observation Techniques

Synthesizing Validity and Reliability Evidence for the Draw-A-Scientist Test

Peer reviewed

Julia Brochey-Taylor; Joseph A. Taylor – Educational Research and Reviews, 2024

The purpose of this synthesis study was to assess the reliability and validity of the Draw-A-Scientist Test (DAST) and its variations across multiple studies, aiming to understand limitations and propose modifications for future application within and beyond the science domain. Given the existence of multiple DAST versions, this study quantified…

Descriptors: Cognitive Tests, Freehand Drawing, Personality Measures, Projective Measures

Can AI Grade Like a Human? Validity, Reliability, and Fairness in University Coursework Assessment

Peer reviewed

Georgios Zacharis; Stamatios Papadakis – Educational Process: International Journal, 2025

Background/purpose: Generative artificial intelligence (GenAI) is often promoted as a transformative tool for assessment, yet evidence of its validity compared to human raters remains limited. This study examined whether an AI-based rater could be used interchangeably with trained faculty in scoring complex coursework. Materials/methods:…

Descriptors: Artificial Intelligence, Technology Uses in Education, Computer Assisted Testing, Grading

Assessing Movement Competence and Screening for Injury Risk in 8-12-Year-Old Children: Reliability of the Child-Focused Injury Risk Screening Tool (ChildFIRST)

Peer reviewed

Miller, Matthew B.; Jimenez-Garcia, John Alexander; Hong, Chang Ki; DeMont, Richard – Measurement in Physical Education and Exercise Science, 2020

The Child-Focused Injury Risk Screening Tool (ChildFIRST) is a process-based assessment including 10 movement skills with 4 associated evaluation criteria. The ChildFIRST has been validated by a group of experts to evaluate movement competence and injury risk in 8-12-year-olds. The purpose of this study is to evaluate the reliability of the…

Descriptors: Screening Tests, Risk Assessment, Injuries, Psychomotor Skills

Development and Validation of the Written Communication Assessment of the "HEIghten"® Outcomes Assessment Suite. Research Report. ETS RR-17-53

Peer reviewed

Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017

Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…

Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment

Comparison of Integrated Testlet and Constructed-Response Question Formats

Peer reviewed

Slepkov, Aaron D.; Shiell, Ralph C. – Physical Review Special Topics - Physics Education Research, 2014

Constructed-response (CR) questions are a mainstay of introductory physics textbooks and exams. However, because of the time, cost, and scoring reliability constraints associated with this format, CR questions are being increasingly replaced by multiple-choice (MC) questions in formal exams. The integrated testlet (IT) is a recently developed…

Descriptors: Science Tests, Physics, Responses, Multiple Choice Tests

Peer Grading in a MOOC: Reliability, Validity, and Perceived Effects

Peer reviewed

Luo, Heng; Robinson, Anthony C.; Park, Jae-Young – Journal of Asynchronous Learning Networks, 2014

Peer grading affords a scalable and sustainable way of providing assessment and feedback to a massive student population, and has been used in massive open online courses (MOOCs) on the Coursera platform. However, currently there is little empirical evidence to support the credentials of peer grading as a learning assessment method in the MOOC…

Descriptors: Peer Evaluation, Online Courses, Open Education, Learning Experience

An Intervention and Assessment to Improve Information Literacy

Scharf, Davida – ProQuest LLC, 2013

Purpose: The goal of the study was to test an intervention using a brief essay as an instrument for evaluating higher-order information literacy skills in college students, while accounting for prior conditions such as socioeconomic status and prior academic achievement, and identify other predictors of information literacy through an evaluation…

Descriptors: Information Literacy, Intervention, Student Evaluation, College Students

Standardising Assessment to Meet Student Needs in Foreign Language Modules in a University Context: Is Standardisation Possible?

Peer reviewed

Nunan, Anna – Language Learning in Higher Education, 2014

The Applied Language Centre at University College Dublin offers foreign language modules to students in ten languages at CEFR [Common European Framework of Reference for Languages] levels ranging from A1 to B2. Efforts have been underway in the Centre to standardise the assessment components across languages to ensure parity between module credits…

Descriptors: Second Language Learning, Second Language Instruction, College Students, Standards

Diagnosing the English Speaking Ability of College Students in China -- Validation of the Diagnostic College English Speaking Test

Zhao, Zhongbao – RELC Journal: A Journal of Language Teaching and Research, 2013

This study investigates the validity of the Diagnostic College English Speaking Test (DCEST) in the context of EFL teaching and learning in China. The experiment was conducted in three stages over the course of eight weeks at a national key university in China. By means of test administration and questionnaire survey, the researcher gathered…

Descriptors: Oral Language, Construct Validity, Language Tests, Diagnostic Tests

A Reliability Analysis of Goal Attainment Scaling (GAS) Weights

Peer reviewed

Marson, Stephen M.; Wei, Guo; Wasserman, Deborah – American Journal of Evaluation, 2009

Goal attainment scaling (GAS) has been considered to be one of the most versatile and appealing evaluation protocols available for human services. Aspects of the protocol that make the method so appealing to practitioners--that is, collaboratively working with individual clients to identify and assign weights to goals they will work to…

Descriptors: Human Services, Scaling, Test Reliability, Interrater Reliability

Measuring the Academic Skills of University Students: Evaluation of a Diagnostic Procedure

Peer reviewed

Erling, Elizabeth J.; Richardson, John T. E. – Assessing Writing, 2010

Measuring the Academic Skills of University Students is a procedure developed in the 1990s at the University of Sydney's Language Centre to identify students in need of academic writing development by assessing examples of their written work against five criteria. This paper reviews the literature relating to the development of the procedure with…

Descriptors: Foreign Countries, Writing Evaluation, Assignments, Psychometrics

Hamilton Rating Scale for Depression: Reliability and Validity of Judgments of Novice Raters.

Peer reviewed

O'Hara, Michael W.; Rehm, Lynn P. – Journal of Consulting and Clinical Psychology, 1983

Used the intraclass correlation coefficient to estimate the interrater reliability of judgments of clinician and novice raters of depressed females (N=20) who took the Hamilton Rating Scale for Depression (HRSD). Expert and student raters both made reliable ratings on the HRSD. Criterion validity for student raters was also satisfactory.…

Descriptors: College Students, Comparative Testing, Cost Effectiveness, Counselor Role

Assessing the Writing of Deaf College Students: Reevaluating a Direct Assessment of Writing

Peer reviewed

Schley, Sara; Albertini, John – Journal of Deaf Studies and Deaf Education, 2005

The NTID Writing Test was developed to assess the writing ability of postsecondary deaf students entering the National Technical Institute for the Deaf and to determine their appropriate placement into developmental writing courses. While previous research (Albertini et al., 1986; Albertini et al., 1996; Bochner, Albertini, Samar, & Metz, 1992)…

Descriptors: Deafness, Writing Ability, Writing Tests, College Students
