Showing all 10 results
Peer reviewed
Joseph H. Grochowalski; Lei Wan; Lauren Molin; Amy H. Hendrickson – Journal of Educational Measurement, 2025
The Beuk standard setting method derives cut scores through expert judgment that balances content and normative perspectives. This study developed a method to estimate confidence intervals for Beuk settings and assessed their accuracy via simulations. Simulations varied SME panel size, expert agreement, cut score locations, score distributions,…
Descriptors: Cutting Scores, Standard Setting, Accuracy, Statistical Bias
Peer reviewed
Leonie Fleck; Dorothee Amelung; Anna Fuchs; Benjamin Mayer; Malvin Escher; Lena Listunova; Jobst-Hendrik Schultz; Andreas Möltner; Clara Schütte; Tim Wittenberg; Isabella Schneider; Sabine C. Herpertz – Advances in Health Sciences Education, 2025
Doctors' interactional competencies play a crucial role in patient satisfaction, well-being, and compliance. Accordingly, it is in medical schools' interest to select candidates with strong interactional abilities. While Multiple Mini Interviews (MMIs) provide a useful context to assess such abilities, the evaluation of candidate performance…
Descriptors: Medical Students, Medical Schools, College Admission, Admission Criteria
Peer reviewed
Martinková, Patrícia; Bartoš, František; Brabec, Marek – Journal of Educational and Behavioral Statistics, 2023
Inter-rater reliability (IRR), which is a prerequisite of high-quality ratings and assessments, may be affected by contextual variables, such as the rater's or ratee's gender, major, or experience. Identification of such heterogeneity sources in IRR is important for the implementation of policies with the potential to decrease measurement error…
Descriptors: Interrater Reliability, Bayesian Statistics, Statistical Inference, Hierarchical Linear Modeling
Peer reviewed
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Peer reviewed
PDF on ERIC
Marzieh Pashmdarfard; Afsoon Hassani Mehraban; Narges Shafaroodi; Kamran Soltani Arabshahi; Soroor Parvizy; Akram Azad; Samaneh Karamali Esmaeili – Journal of Occupational Therapy Education, 2022
Fieldwork education is an integral part of the educational process in occupational therapy and assessing student competency at the end of fieldwork is important. The aim of this study was to design and conduct an Objective Structured Clinical Examination (OSCE) based on the Occupational Therapy Practice Framework (OTPF) for occupational therapy…
Descriptors: Occupational Therapy, Allied Health Occupations Education, Test Construction, Test Validity
Peer reviewed
PDF on ERIC
Bosch, Nigel; Paquette, Luc – Journal of Learning Analytics, 2018
Metrics including Cohen's kappa, precision, recall, and F1 are common measures of performance for models of discrete student states, such as a student's affect or behaviour. This study examined discrete model metrics for previously published student model examples to identify situations where metrics provided differing perspectives on…
Descriptors: Models, Comparative Analysis, Prediction, Probability
Peer reviewed
Sacristan, Dolly; Martinez, Colleen D. – Journal of Teaching in Social Work, 2023
Social work educators are compelled to use reliable and valid methods to assess student learning outcomes. This study adapted a clinical simulation by integrating traditional role-play of case scenarios and elements of the Objective Structured Clinical Examination, which is often used to assess students' practice skills. Master of Social Work…
Descriptors: Graduate Students, Counselor Training, Masters Programs, Clinical Experience
Peer reviewed
PDF on ERIC
Wilhelm, Anne Garrison; Gillespie Rouse, Amy; Jones, Francesca – Practical Assessment, Research & Evaluation, 2018
Although inter-rater reliability is an important aspect of using observational instruments, it has received little theoretical attention. In this article, we offer some guidance for practitioners and consumers of classroom observations so that they can make decisions about inter-rater reliability, both for study design and in the reporting of data…
Descriptors: Interrater Reliability, Measurement, Observation, Educational Research
Peer reviewed
Tavares, Walter; Brydges, Ryan; Myre, Paul; Prpic, Jason; Turner, Linda; Yelle, Richard; Huiskamp, Maud – Advances in Health Sciences Education, 2018
Assessment of clinical competence is complex and inference based. Trustworthy and defensible assessment processes must have favourable evidence of validity, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high stakes simulation based assessment strategy for certifying…
Descriptors: Competence, Simulation, Allied Health Personnel, Certification
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators