Showing all 3 results
Peer reviewed
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Williamson, David M.; Johnson, Matthew S.; Sinharay, Sandip; Bejar, Isaac I. – 2002
This paper explores the application of a technique for hierarchical item response theory (IRT) calibration of complex constructed response tasks that has promise both as a calibration tool and as a means of evaluating the isomorphic equivalence of complex constructed response tasks. Isomorphic tasks are explicitly and rigorously designed to be…
Descriptors: Bayesian Statistics, Constructed Response, Estimation (Mathematics), Evaluation Methods
Peer reviewed
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2006
Bayesian networks are frequently used in educational assessments, primarily for learning about students' knowledge and skills. Little work has been done on assessing the fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess the fit of simple Bayesian networks. A…
Descriptors: Models, Educational Assessment, Diagnostic Tests, Evaluation Methods