NotesFAQContact Us
Collection
Advanced
Search Tips
Audience
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 23 results Save | Export
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025
As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…
Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Tülübas, Tijen; Demirkol, Murat; Ozdemir, Tuncay Yavuz; Polat, Hakan; Karakose, Turgut; Yirci, Ramazan – Educational Process: International Journal, 2023
Background/purpose: ChatGPT, a recent form of AI-based language model, have garnered interest among people from diverse backgrounds with its immersive capabilities. Using ChatGPT to support or generate scientific research has also created an ongoing debate over its advantages versus risks. The present study aimed to conduct an AI-enabled research…
Descriptors: Artificial Intelligence, Emergency Programs, Distance Education, COVID-19
Peer reviewed Peer reviewed
Direct linkDirect link
Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024
This paper proposes to use depth perception to represent raters' decision in holistic evaluation of ESL essays, as an alternative medium to conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…
Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy
Benton, Tom; Leech, Tony; Hughes, Sarah – Cambridge Assessment, 2020
In the context of examinations, the phrase "maintaining standards" usually refers to any activity designed to ensure that it is no easier (or harder) to achieve a given grade in one year than in another. Specifically, it tends to mean activities associated with setting examination grade boundaries. Benton et al (2020) describes a method…
Descriptors: Mathematics Tests, Equated Scores, Comparative Analysis, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Manzano, Dexter L. – International Journal of Language Testing, 2022
The increasing popularity of self-assessment prompted several scholars to investigate its effectiveness and accuracy in relation to teacher assessment. However, most of these studies focused only on the consistency estimate perspective. Thus, the current study investigated the interrater reliability between self- and teacher assessment of…
Descriptors: Oral Language, Self Evaluation (Individuals), College Students, Interrater Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Armijo-Olivo, Susan; Craig, Rodger; Campbell, Sandy – Research Synthesis Methods, 2020
Background: Evidence from new health technologies is growing, along with demands for evidence to inform policy decisions, creating challenges in completing health technology assessments (HTAs)/systematic reviews (SRs) in a timely manner. Software can decrease the time and burden by automating the process, but evidence validating such software is…
Descriptors: Comparative Analysis, Computer Software, Decision Making, Randomized Controlled Trials
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Bosch, Nigel; Paquette, Luc – Journal of Learning Analytics, 2018
Metrics including Cohen's kappa, precision, recall, and F[subscript 1] are common measures of performance for models of discrete student states, such as a student's affect or behaviour. This study examined discrete model metrics for previously published student model examples to identify situations where metrics provided differing perspectives on…
Descriptors: Models, Comparative Analysis, Prediction, Probability
Peer reviewed Peer reviewed
Direct linkDirect link
Saluja, Ronak; Cheng, Sierra; delos Santos, Keemo Althea; Chan, Kelvin K. W. – Research Synthesis Methods, 2019
Objective: Various statistical methods have been developed to estimate hazard ratios (HRs) from published Kaplan-Meier (KM) curves for the purpose of performing meta-analyses. The objective of this study was to determine the reliability, accuracy, and precision of four commonly used methods by Guyot, Williamson, Parmar, and Hoyle and Henley.…
Descriptors: Meta Analysis, Reliability, Accuracy, Randomized Controlled Trials
Wagner, Kyle; Smith, Alex; Allen, Abigail; McMaster, Kristen; Poch, Apryl; Lembke, Erica – Assessment for Effective Intervention, 2019
Researchers and practitioners have questioned whether scoring procedures used with curriculum-based measures of writing (CBM-W) capture growth in complexity of writing. We analyzed data from six independent samples to examine two potential scoring metrics for picture word CBM-W (PW), a sentence-level CBM task. Correct word sequences per response…
Descriptors: Curriculum Based Assessment, Writing Evaluation, Comparative Analysis, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Morris, Darrell; Pennell, Ashley M.; Perney, Jan; Trathen, Woodrow – Reading Psychology, 2018
This study compared reading rate to reading fluency (as measured by a rating scale). After listening to first graders read short passages, we assigned an overall fluency rating (low, average, or high) to each reading. We then used predictive discriminant analyses to determine which of five measures--accuracy, rate (objective); accuracy, phrasing,…
Descriptors: Reading Fluency, Prediction, Grade 1, Elementary School Students
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Thawabieh, Ahmad M. – Journal of Curriculum and Teaching, 2017
This study aimed to compare between the students' self-assessment and teachers' assessment. The study sample consisted of 71 students at Tafila Technical University studying Introduction to Psychology course. The researcher used 2 students' self-assessment tools and 2 tests. The results indicated that students can assess themselves accurately if…
Descriptors: Comparative Analysis, Self Evaluation (Individuals), Student Evaluation, Psychology
Peer reviewed Peer reviewed
Direct linkDirect link
Choo, Dawn; Dettman, Shani J. – Deafness & Education International, 2016
During the pre- and post-implant habilitation process, mothers of children using cochlear implants may be coached by clinicians to use appropriate communicative strategies during play according to the family's choice of communication approach. The present study compared observations made by experienced and inexperienced individuals in the analysis…
Descriptors: Parent Child Relationship, Mothers, Video Technology, Observation
Peer reviewed Peer reviewed
Direct linkDirect link
Uto, Masaki; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2016
As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, in peer assessment, a problem remains that reliability depends on the rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…
Descriptors: Item Response Theory, Peer Evaluation, Bayesian Statistics, Simulation
Peer reviewed Peer reviewed
Direct linkDirect link
Attali, Yigal – Language Testing, 2016
A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…
Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators
Previous Page | Next Page »
Pages: 1  |  2