NotesFAQContact Us
Collection
Advanced
Search Tips
Showing 1 to 15 of 6,225 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Hsin-Yun Lee; You-Lin Chen; Li-Jen Weng – Journal of Experimental Education, 2024
The second version of Kaiser's Measure of Sampling Adequacy (MSA[subscript 2]) has been widely applied to assess the factorability of data in psychological research. The MSA[subscript 2] is developed in the population and little is known about its behavior in finite samples. If estimated MSA[subscript 2]s are biased due to sampling errors,…
Descriptors: Error of Measurement, Reliability, Sampling, Statistical Bias
Peer reviewed Peer reviewed
Direct linkDirect link
Marcus Messer; Neil C. C. Brown; Michael Kölling; Miaojing Shi – ACM Transactions on Computing Education, 2025
Providing consistent summative assessment to students is important, as the grades they are awarded affect their progression through university and future career prospects. While small cohorts are typically assessed by a single assessor, such as the module/class leader, larger cohorts are often assessed by multiple assessors, typically teaching…
Descriptors: Foreign Countries, Grading, Interrater Reliability, Teaching Assistants
Peer reviewed Peer reviewed
Direct linkDirect link
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Sarah Stopforth; Roxanne Connelly; Vernon Gayle – Cambridge Journal of Education, 2025
Data on educational qualifications is essential in many research domains. The UK Millennium Cohort Study collected self-reported General Certificate of Secondary Education (GCSE) data in sweep 7 (cohort members aged 17). GCSE data from the National Pupil Database (NPD) has been linked to the MCS. This study investigates the consistency of these…
Descriptors: Foreign Countries, Adolescents, Case Studies, Secondary Education
Peer reviewed Peer reviewed
Direct linkDirect link
Matthew K. Burns; Heba Z. Abdelnaby; Jonie B. Welland; Katherine A. Graves; Kari Kurto – Assessment for Effective Intervention, 2024
The current study examined the reliability of The Reading League Curriculum-Evaluation Guidelines (CEGs), which were developed to help school-based teams rate the presence of red flags when considering adopting specific literacy curricula. Coders (n = 30) independently used the CEGs to evaluate a free online English language arts curriculum. The…
Descriptors: English Curriculum, English Instruction, Language Arts, Curriculum Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
Morten Pallisgaard Støve; Mathias Kringelholt Kristensen; Jonas Nielsen; Lea Dyhrberg Madsen – Measurement in Physical Education and Exercise Science, 2025
Between limb strength, asymmetry is a leading risk factor for hamstring strain re-injury. However, few accurate testing methodologies are available in clinical settings. This study examined the validity and reliability of eccentric knee flexor torque measured with a novel Nordic Hamstring Device. Twenty-seven healthy participants were assessed in…
Descriptors: Validity, Reliability, Human Body, Foreign Countries
Peer reviewed Peer reviewed
Direct linkDirect link
Hulteen, Ryan M.; True, Larissa; Kroc, Edward – Measurement in Physical Education and Exercise Science, 2023
The typical process for assessing inter-rater reliability is facilitated by training raters within a research team. Lacking is an understanding if inter-rater reliability scores "between" research teams demonstrate adequate reliability. This study examined inter-rater reliability between 16 researchers who assessed fundamental motor…
Descriptors: Psychomotor Skills, Scores, Reliability, Interrater Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Niamh Devane; Sofia Mazzoleni; Nicholas Behn; Jane Marshall; Stephanie Wilson; Katerina Hilari – International Journal of Language & Communication Disorders, 2025
Background and Aims: The reliability and validity of an intervention can be improved by checking treatment fidelity (TF). TF methods identify core components of an intervention, check their presence (or absence) and identify threats to fidelity. The Virtual Elaborated Semantic Feature Analysis (VESFA) intervention comprised individual sessions of…
Descriptors: Aphasia, Intervention, Fidelity, Feasibility Studies
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Alan Huebner; Gustaf B. Skar; Mengchen Huang – Practical Assessment, Research & Evaluation, 2025
Generalizability theory is a modern and powerful framework for conducting reliability analyses. It is flexible to accommodate both random and fixed facets. However, there has been a relative scarcity in the practical literature on how to handle the fixed facet case. This article aims to provide practitioners a conceptual understanding and…
Descriptors: Generalizability Theory, Multivariate Analysis, Statistical Analysis, Writing Evaluation
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Wahyu Nanda Eka Saputra; Trikinasih Handayani; Prima Suci Rohmadheny; Rohmatus Naini; Dody Hartanto; Hardi Santosa; Dewi Afra Khairunnisa; Risma Risansyah; Hanan Riati; Faturrahman – Journal of Education and Learning (EduLearn), 2025
The students are urged to do something without expecting anything in return and only in the name of God. Every islamic student becomes something ideal if they can internalize and implement sincerity. Many people are willing to do something because of an ulterior motive. The importance of sincerity in humans is the background for developing a…
Descriptors: Islam, Interrater Reliability, Prosocial Behavior, Muslims
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Victor K. Y. Chan – International Association for Development of the Information Society, 2025
This paper extends a prior study on the consistency of generative Artificial Intelligence (AI) models in evaluating Massive Open Online Course (MOOC) platforms. While the original work focused on the consistency of direct numerical scores, this research investigates the consistency of the rankings derived from those scores. When evaluating…
Descriptors: Artificial Intelligence, MOOCs, Reliability, Evaluation Methods
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Conrad Borchers – International Educational Data Mining Society, 2025
Algorithmic bias is a pressing concern in educational data mining (EDM), as it risks amplifying inequities in learning outcomes. The Area Between ROC Curves (ABROCA) metric is frequently used to measure discrepancies in model performance across demographic groups to quantify overall model fairness. However, its skewed distribution--especially when…
Descriptors: Algorithms, Bias, Statistics, Simulation
Peer reviewed Peer reviewed
Direct linkDirect link
Huan Liu; Won-Chan Lee – Journal of Educational Measurement, 2025
This study investigates the estimation of classification consistency and accuracy indices for composite summed and theta scores within the SS-MIRT framework, using five popular approaches, including the Lee, Rudner, Guo, Bayesian EAP, and Bayesian MCMC approaches. The procedures are illustrated through analysis of two real datasets and further…
Descriptors: Classification, Reliability, Accuracy, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Tim Moses; YoungKoung Kim – Journal of Educational Measurement, 2025
This study considers the estimation of marginal reliability and conditional accuracy measures using a generalized recursion procedure with several IRT-based ability and score estimators. The estimators include MLE, TCC, and EAP abilities, and corresponding test scores obtained with different weightings of the item scores. We consider reliability…
Descriptors: Item Response Theory, Scoring, Reliability, Accuracy
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Riana Nurhayati; Suranto Aw; Siti Irene Astuti Dwiningrum; Mami Hajaroh; Herwin Herwin – International Journal of Educational Methodology, 2024
Evaluation of child-friendly school (CFS) policies is essential to determine the achievements of school efforts in reducing violence cases. This research aims to proving the reliability and validity of CFS policy evaluation instruments in elementary schools with different locations. This investigation uses the Context Input Process Product (CIPP)…
Descriptors: Validity, Reliability, School Policy, Program Evaluation
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  415