Showing 1 to 15 of 23 results
Peer reviewed
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Peer reviewed
Chase Young; Benjamin Mitchell-Yellin; George Kevin Randall – Active Learning in Higher Education, 2025
The purpose of this study was to develop a valid, reliable, and brief measure of active learning in college classrooms that is cheap and easy to complete and yields results that faculty can easily use to inform their development as instructors. Initial construct and face validity was achieved by modifying existing instruments and creating a draft…
Descriptors: College Faculty, College Students, Active Learning, Classroom Observation Techniques
Peer reviewed
PDF on ERIC
Julia Brochey-Taylor; Joseph A. Taylor – Educational Research and Reviews, 2024
The purpose of this synthesis study was to assess the reliability and validity of the Draw-A-Scientist Test (DAST) and its variations across multiple studies, aiming to understand limitations and propose modifications for future application within and beyond the science domain. Given the existence of multiple DAST versions, this study quantified…
Descriptors: Cognitive Tests, Freehand Drawing, Personality Measures, Projective Measures
Peer reviewed
PDF on ERIC
Georgios Zacharis; Stamatios Papadakis – Educational Process: International Journal, 2025
Background/purpose: Generative artificial intelligence (GenAI) is often promoted as a transformative tool for assessment, yet evidence of its validity compared to human raters remains limited. This study examined whether an AI-based rater could be used interchangeably with trained faculty in scoring complex coursework. Materials/methods:…
Descriptors: Artificial Intelligence, Technology Uses in Education, Computer Assisted Testing, Grading
Peer reviewed
Miller, Matthew B.; Jimenez-Garcia, John Alexander; Hong, Chang Ki; DeMont, Richard – Measurement in Physical Education and Exercise Science, 2020
The Child-Focused Injury Risk Screening Tool (ChildFIRST) is a process-based assessment including 10 movement skills with 4 associated evaluation criteria. The ChildFIRST has been validated by a group of experts to evaluate movement competence and injury risk in 8-12-year-olds. The purpose of this study is to evaluate the reliability of the…
Descriptors: Screening Tests, Risk Assessment, Injuries, Psychomotor Skills
Peer reviewed
PDF on ERIC
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
Peer reviewed
Slepkov, Aaron D.; Shiell, Ralph C. – Physical Review Special Topics - Physics Education Research, 2014
Constructed-response (CR) questions are a mainstay of introductory physics textbooks and exams. However, because of the time, cost, and scoring reliability constraints associated with this format, CR questions are being increasingly replaced by multiple-choice (MC) questions in formal exams. The integrated testlet (IT) is a recently developed…
Descriptors: Science Tests, Physics, Responses, Multiple Choice Tests
Peer reviewed
PDF on ERIC
Luo, Heng; Robinson, Anthony C.; Park, Jae-Young – Journal of Asynchronous Learning Networks, 2014
Peer grading affords a scalable and sustainable way of providing assessment and feedback to a massive student population, and has been used in massive open online courses (MOOCs) on the Coursera platform. However, currently there is little empirical evidence to support the credentials of peer grading as a learning assessment method in the MOOC…
Descriptors: Peer Evaluation, Online Courses, Open Education, Learning Experience
Scharf, Davida – ProQuest LLC, 2013
Purpose: The goal of the study was to test an intervention using a brief essay as an instrument for evaluating higher-order information literacy skills in college students, while accounting for prior conditions such as socioeconomic status and prior academic achievement, and identify other predictors of information literacy through an evaluation…
Descriptors: Information Literacy, Intervention, Student Evaluation, College Students
Peer reviewed
Nunan, Anna – Language Learning in Higher Education, 2014
The Applied Language Centre at University College Dublin offers foreign language modules to students in ten languages at CEFR [Common European Framework of Reference for Languages] levels ranging from A1 to B2. Efforts have been underway in the Centre to standardise the assessment components across languages to ensure parity between module credits…
Descriptors: Second Language Learning, Second Language Instruction, College Students, Standards
Zhao, Zhongbao – RELC Journal: A Journal of Language Teaching and Research, 2013
This study investigates the validity of the Diagnostic College English Speaking Test (DCEST) in the context of EFL teaching and learning in China. The experiment was conducted in three stages over the course of eight weeks at a national key university in China. By means of test administration and questionnaire survey, the researcher gathered…
Descriptors: Oral Language, Construct Validity, Language Tests, Diagnostic Tests
Peer reviewed
Marson, Stephen M.; Wei, Guo; Wasserman, Deborah – American Journal of Evaluation, 2009
Goal attainment scaling (GAS) has been considered to be one of the most versatile and appealing evaluation protocols available for human services. Aspects of the protocol that make the method so appealing to practitioners--that is, collaboratively working with individual clients to identify and assign weights to goals they will work to…
Descriptors: Human Services, Scaling, Test Reliability, Interrater Reliability
Peer reviewed
Erling, Elizabeth J.; Richardson, John T. E. – Assessing Writing, 2010
Measuring the Academic Skills of University Students is a procedure developed in the 1990s at the University of Sydney's Language Centre to identify students in need of academic writing development by assessing examples of their written work against five criteria. This paper reviews the literature relating to the development of the procedure with…
Descriptors: Foreign Countries, Writing Evaluation, Assignments, Psychometrics
Peer reviewed
O'Hara, Michael W.; Rehm, Lynn P. – Journal of Consulting and Clinical Psychology, 1983
Used the intraclass correlation coefficient to estimate the interrater reliability of judgments of clinician and novice raters of depressed females (N=20) who took the Hamilton Rating Scale for Depression (HRSD). Expert and student raters both made reliable ratings on the HRSD. Criterion validity for student raters was also satisfactory.…
Descriptors: College Students, Comparative Testing, Cost Effectiveness, Counselor Role
Peer reviewed
Schley, Sara; Albertini, John – Journal of Deaf Studies and Deaf Education, 2005
The NTID Writing Test was developed to assess the writing ability of postsecondary deaf students entering the National Technical Institute for the Deaf and to determine their appropriate placement into developmental writing courses. While previous research (Albertini et al., 1986; Albertini et al., 1996; Bochner, Albertini, Samar, & Metz, 1992)…
Descriptors: Deafness, Writing Ability, Writing Tests, College Students