The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Williamson, David M.; Johnson, Matthew S.; Sinharay, Sandip; Bejar, Isaac I. – 2002
This paper explores the application of a technique for hierarchical item response theory (IRT) calibration of complex constructed response tasks that has promise both as a calibration tool and as a means of evaluating the isomorphic equivalence of complex constructed response tasks. Isomorphic tasks are explicitly and rigorously designed to be…
Descriptors: Bayesian Statistics, Constructed Response, Estimation (Mathematics), Evaluation Methods
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2006
Bayesian networks are frequently used in educational assessments, primarily for learning about students' knowledge and skills. There is a lack of work on assessing the fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess the fit of simple Bayesian networks. A…
Descriptors: Models, Educational Assessment, Diagnostic Tests, Evaluation Methods

Peer reviewed
