Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 2 |
| Since 2007 (last 20 years) | 11 |
Descriptor
| Interrater Reliability | 14 |
| Models | 14 |
| Reliability | 14 |
| Validity | 7 |
| Academic Achievement | 5 |
| Scores | 4 |
| Standards | 3 |
| Accountability | 2 |
| At Risk Students | 2 |
| Classification | 2 |
| College Students | 2 |
Author
| Goe, Laura | 2 |
| Holdheide, Lynn | 2 |
| Miller, Tricia | 2 |
| Aljunied, Mariam | 1 |
| Balan, Andreia | 1 |
| Blondin, Carolyn A. | 1 |
| Cason, Carolyn L. | 1 |
| Cropley, David H. | 1 |
| Fisher, Steven P. | 1 |
| Frederickson, Norah | 1 |
| Galyon, Charles E. | 1 |
Publication Type
| Journal Articles | 10 |
| Reports - Research | 10 |
| Guides - Non-Classroom | 2 |
| Reports - Descriptive | 2 |
| Speeches/Meeting Papers | 2 |
| Opinion Papers | 1 |
Education Level
| Higher Education | 3 |
| Postsecondary Education | 2 |
| High Schools | 1 |
Audience
| Researchers | 2 |
| Policymakers | 1 |
Location
| California | 1 |
| Singapore | 1 |
Jönsson, Anders; Balan, Andreia – Practical Assessment, Research & Evaluation, 2018
Research on teachers' grading has shown that there is great variability among teachers regarding both the process and product of grading, resulting in low comparability and issues of inequality when using grades for selection purposes. Despite this situation, not much is known about the merits or disadvantages of different models for grading. In…
Descriptors: Grading, Models, Reliability, Validity
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Seiverling, Laura; Harclerode, Whitney; Williams, Keith – Education and Treatment of Children, 2014
The purpose of this study was to examine if sequential presentation with feeder modeling would lead to an increase in bites accepted of new foods compared to sequential presentation without feeder modeling in a typically developing 4-year-old boy with food selectivity. The participant's acceptance of novel foods increased both in the modeling and…
Descriptors: Food, Behavior Modification, Toddlers, Males
Blondin, Carolyn A.; Voils, Kyle; Galyon, Charles E.; Williams, Robert L. – Journal on Excellence in College Teaching, 2015
Concepts from the Response-to-Intervention (RTI) Model were used to promote a successful course outcome for students at risk for making low grades in an entry-level college course. The first exam served as a universal screener to identify students who could potentially benefit from RTI assistance. The researchers developed a tiered coaching…
Descriptors: Response to Intervention, Models, At Risk Students, Coaching (Performance)
Aljunied, Mariam; Frederickson, Norah – Educational Psychology in Practice, 2014
Despite embracing a bio-psycho-social perspective, the World Health Organization's International Classification of Functioning, Disability and Health (ICF) assessment framework has had limited application to date with children who have special educational needs (SEN). This study examines its utility for educational psychologists' work with…
Descriptors: Educational Psychology, Classification, Clinical Diagnosis, Special Needs Students
Young, Karen; James, Kimberley; Noy, Sue – Asia-Pacific Journal of Cooperative Education, 2016
Work integrated learning (WIL) educators using reflective practice to facilitate student learning require a set of standards that works within the traditional assessment frame of Higher Education, to ascertain the level at which reflective practice has been demonstrated. However, there is a paucity of tested assessment instruments that provide…
Descriptors: Work Experience Programs, Reflection, Student Evaluation, Scoring Rubrics
Kilgus, Stephen P.; Sims, Wesley A.; von der Embse, Nathaniel P.; Riley-Tillman, T. Chris – School Psychology Quarterly, 2015
The purpose of this investigation was to evaluate the models for interpretation and use that serve as the foundation of an interpretation/use argument for the Social and Academic Behavior Risk Screener (SABRS). The SABRS was completed by 34 teachers with regard to 488 students in a Midwestern high school during the winter portion of the academic…
Descriptors: Screening Tests, Factor Analysis, Models, High School Students
Goe, Laura; Holdheide, Lynn; Miller, Tricia – Center on Great Teachers and Leaders, 2014
Across the nation, states and districts are in the process of building better teacher evaluation systems that not only identify highly effective teachers but also systematically provide data and feedback that can be used to improve teacher practice. The "Practical Guide to Designing Comprehensive Teacher Evaluation Systems" is a tool…
Descriptors: Teacher Evaluation, Evaluators, Educational Change, Accountability
Cropley, David H.; Kaufman, James C. – Journal of Creative Behavior, 2012
The Creative Solution Diagnosis Scale (CSDS) is a 30-item scale based on a core of four criteria: Relevance & Effectiveness, Novelty, Elegance, and Genesis. The CSDS offers potential for the consensual assessment of functional product creativity. This article describes an empirical study in which non-expert judges rated a series of mousetrap…
Descriptors: Expertise, Creativity, Identification, Measures (Individuals)
Wanstreet, Constance E.; Stein, David S. – American Journal of Distance Education, 2011
This study investigated the small-group, learner-led discussion process in synchronous discussions. Transcripts from online chats and face-to-face discussions were analyzed within the context of the Community of Inquiry framework to examine the relationship of teaching presence, social presence, and cognitive presence to one another and for…
Descriptors: Computer Mediated Communication, Inquiry, Models, Asynchronous Communication
Goe, Laura; Holdheide, Lynn; Miller, Tricia – National Comprehensive Center for Teacher Quality, 2011
Across the nation, states and districts are in the process of building better teacher evaluation systems that not only identify highly effective teachers but also systematically provide data and feedback that can be used to improve teacher practice. "A Practical Guide to Designing Comprehensive Teacher Evaluation Systems" is a tool…
Descriptors: Feedback (Response), Teacher Effectiveness, Evaluators, Teacher Evaluation
Johnson, Robert L.; Penny, James; Gordon, Belita; Shumate, Steven R.; Fisher, Steven P. – Language Assessment Quarterly, 2005
Many studies have indicated that at least 2 raters should score writing assessments to improve interrater reliability. However, even for assessments that characteristically demonstrate high levels of rater agreement, 2 raters of the same essay can occasionally report different, or discrepant, scores. If a single score, typically referred to as an…
Descriptors: Interrater Reliability, Scores, Evaluation, Reliability
Suen, Hoi K.; And Others – Journal of Early Intervention, 1995 (peer reviewed)
This paper suggests that in addressing the issue of parent-professional congruence in child assessment, researchers should avoid focusing on the conventional aspects of interrater reliability and rater interchangeability, but rather should focus on the reliability of the pooled assessment information from parents and professionals. A…
Descriptors: Disabilities, Early Childhood Education, Early Intervention, Evaluation Methods
Cason, Carolyn L.; And Others – 1986
Cason and Cason's model of performance rating was used to determine the extent to which variation in reviewer standards affected the reliability and validity of the program review process used to select papers for inclusion in the annual program. Data analyzed were the overall recommendation for acceptance and ratings on seven quality criteria…
Descriptors: Conference Papers, Data Analysis, Educational Research, Evaluation Criteria

