Showing 1 to 15 of 22 results
Peer reviewed
Haoze Du; Richard Li; Edward Gehringer – International Educational Data Mining Society, 2025
Evaluating the performance of Large Language Models (LLMs) is a critical yet challenging task, particularly when aiming to avoid subjective assessments. This paper proposes a framework for leveraging subjective metrics derived from the class textual materials across different semesters to assess LLM outputs across various tasks. By utilizing…
Descriptors: Artificial Intelligence, Performance, Evaluation, Automation
Peer reviewed
Liyang Sun; Eli Ben-Michael; Avi Feller – Grantee Submission, 2024
The synthetic control method (SCM) is a popular approach for estimating the impact of a treatment on a single unit with panel data. Two challenges arise with higher frequency data (e.g., monthly versus yearly): (1) achieving excellent pre-treatment fit is typically more challenging; and (2) overfitting to noise is more likely. Aggregating data…
Descriptors: Evaluation Methods, Comparative Analysis, Computation, Data Analysis
Peer reviewed
Bianca Montrosse-Moorhead; Amanda Sutter; Chisomo Phiri – AERA Online Paper Repository, 2024
Evaluators widely agree that the involvement of key actors is a central aspect of quality practice. However, the extent to which and how youth are included has not been explored through a scoping review. This is an important omission because many evaluations are done on programs that directly serve youth. We examined 159 evaluation reports in a publicly…
Descriptors: Literature Reviews, Youth, Inclusion, Program Evaluation
Peer reviewed
Mari Ueda; Yuichi Tsumoto; Tetsuo Tanaka – International Association for Development of the Information Society, 2024
According to architectural designers, although they are aware of the sound environment when designing spaces, the visual (design) and cost (cost-effectiveness, etc.) aspects were in many cases the predominant factors. The tacit rule for evaluating the sound…
Descriptors: Foreign Countries, Acoustics, Architecture, Architectural Education
Peer reviewed
Nhat Tran; Benjamin Pierce; Diane Litman; Richard Correnti; Lindsay Clare Matsumura – International Educational Data Mining Society, 2024
Automatically assessing classroom discussion quality is becoming increasingly feasible with the help of new NLP advancements such as large language models (LLMs). In this work, we examine how the assessment performance of 2 LLMs interacts with 3 factors that may affect performance: task formulation, context length, and few-shot examples. We also…
Descriptors: Artificial Intelligence, Technology Uses in Education, Discussion (Teaching Technique), Language Arts
Peer reviewed
Ramashego Mphahlele – Journal of Learning for Development, 2024
This paper reviews 38 studies conducted between 2015 and 2022 on collaborative assessments in open-distance and e-learning (ODeL) contexts, focusing on the benefits, types, challenges, and strategies to improve collaborative assessments. This qualitative review aims to investigate collaborative assessments within the ODeL comprehensively. The…
Descriptors: Cooperative Learning, Student Evaluation, Evaluation Methods, Distance Education
Peer reviewed
Lisa DaVia Rubenstein; Kathrin Maki; Brianna Quigley; Shanyn Thompson; Lisa M. Ridgley Smith – AERA Online Paper Repository, 2024
The purpose of this systematic review was to survey available measures of creativity for PK-12 students for assessment characteristics and reporting of psychometric properties. Using the PRISMA framework, we identified 42 unique articles with 48 assessments meeting our inclusion criteria. Then, two coders independently coded all articles using a…
Descriptors: Literature Reviews, Meta Analysis, Elementary Secondary Education, Creativity
Peer reviewed
Ruikun Hou; Babette Bühler; Tim Fütterer; Efe Bozkir; Peter Gerjets; Ulrich Trautwein; Enkelejda Kasneci – International Educational Data Mining Society, 2025
Classroom discourse is an essential vehicle through which teaching and learning take place. Assessing different characteristics of discursive practices and linking them to student learning achievement enhances the understanding of teaching quality. Traditional assessments rely on manual coding of classroom observation protocols, which is…
Descriptors: Discussion (Teaching Technique), Artificial Intelligence, Technology Uses in Education, Evaluation Methods
Peer reviewed
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle S. McNamara – Grantee Submission, 2025
The assessment of student responses to learning-strategy prompts, such as self-explanation, summarization, and paraphrasing, is essential for evaluating cognitive engagement and comprehension. However, manual scoring is resource-intensive, limiting its scalability in educational settings. This study investigates the use of Large Language Models…
Descriptors: Scoring, Computational Linguistics, Computer Software, Artificial Intelligence
Peer reviewed
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle McNamara – International Educational Data Mining Society, 2025
The assessment of student responses to learning-strategy prompts, such as self-explanation, summarization, and paraphrasing, is essential for evaluating cognitive engagement and comprehension. However, manual scoring is resource-intensive, limiting its scalability in educational settings. This study investigates the use of Large Language Models…
Descriptors: Scoring, Computational Linguistics, Computer Software, Artificial Intelligence
Peer reviewed
Oscar Clivio; Avi Feller; Chris Holmes – Grantee Submission, 2024
Reweighting a distribution to minimize a distance to a target distribution is a powerful and flexible strategy for estimating a wide range of causal effects, but can be challenging in practice because optimal weights typically depend on knowledge of the underlying data generating process. In this paper, we focus on design-based weights, which do…
Descriptors: Evaluation Methods, Causal Models, Error of Measurement, Guidelines
Peer reviewed
Yuting Han; Zhehan Jiang; Lingling Xu; Fen Cai – AERA Online Paper Repository, 2024
To address the computational constraints of parameter estimation in the polytomous Cognitive Diagnosis Model (pCDM) in large-scale high data volume situations, this study proposes two two-stage polytomous attribute estimation methods: P_max and P_linear. The effects of the two-stage methods were studied via a Monte Carlo simulation study, and the…
Descriptors: Medical Education, Licensing Examinations (Professions), Measurement Techniques, Statistical Data
Peer reviewed
Rusty Parker; James E. Bartlett; Michelle Elizabeth Bartlett – AERA Online Paper Repository, 2024
Pre-apprenticeship programs help youth and adults enter a Registered Apprenticeship program, further training, or the workforce. One of the challenges with starting and sustaining a pre-apprenticeship program is bridging the gap between business and educational leaders' understanding of the program. The bridge between the two sectors is a…
Descriptors: Apprenticeships, Youth Programs, Partnerships in Education, School Business Relationship
Peer reviewed
W. Jake Thompson – Grantee Submission, 2024
Diagnostic classification models (DCMs) are psychometric models that can be used to estimate the presence or absence of psychological traits, or proficiency on fine-grained skills. Critical to the use of any psychometric model in practice, including DCMs, is an evaluation of model fit. Traditionally, DCMs have been estimated with maximum…
Descriptors: Bayesian Statistics, Classification, Psychometrics, Goodness of Fit
Peer reviewed
Jeffrey Matayoshi; Eric Cosyn; Christopher Lechuga; Hasan Uzun – International Educational Data Mining Society, 2024
ALEKS is an adaptive learning and assessment system, with courses covering subjects such as math, chemistry, and statistics. In this work, we focus on the ALEKS math courses, which cover a wide range of content starting at second grade math and continuing through college-level precalculus. To help instructors and students navigate this content,…
Descriptors: Student Placement, Evaluation Methods, Elementary Secondary Education, Accuracy