The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much needed, there is no off-the-shelf solution for measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Unal, Zafer; Bodur, Yasar; Unal, Aslihan – Contemporary Issues in Technology and Teacher Education (CITE Journal), 2012
The researchers in this study undertook development of a webquest evaluation rubric and investigated its reliability. The rubric was created using the strengths of the currently available webquest rubrics with improvements based on the comments provided in the literature and feedback received from educators. After the rubric was created, 23…
Descriptors: Test Construction, Test Reliability, Instructional Material Evaluation, Scoring Rubrics
Unal, Zafer; Bodur, Yasar; Unal, Aslihan – Journal of Information Technology Education: Research, 2012
Current literature provides many examples of rubrics that are used to evaluate the quality of webquest designs. However, the reliability of these rubrics has not yet been researched. This is the first study to fully characterize and assess the reliability of a webquest evaluation rubric. The ZUNAL rubric was created to utilize the strengths of the…
Descriptors: Scoring Rubrics, Test Reliability, Test Construction, Evaluation Criteria
Stahl, John; And Others – 1996
On-line performance assessment was developed to maximize the usefulness of performance assessment and to minimize the time and labor costs incurred. This paper reports on the development of an on-line performance assessment instrument, focusing on the establishment and validation of the scoring rubric and its implementation in the Rasch model, the…
Descriptors: Computer Software, Computer Software Development, Cost Effectiveness, Interrater Reliability

