Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 1 |
| Since 2007 (last 20 years) | 5 |
Descriptor
| Interrater Reliability | 6 |
| Performance Based Assessment | 6 |
| Validity | 6 |
| Error of Measurement | 2 |
| Item Response Theory | 2 |
| Psychometrics | 2 |
| Rating Scales | 2 |
| Scores | 2 |
| Alternative Assessment | 1 |
| Certification | 1 |
| Chinese | 1 |
Source
| Assessment in Education:… | 1 |
| International Journal of… | 1 |
| Journal of Early Intervention | 1 |
| Journal of Research in Music… | 1 |
| Language Assessment Quarterly | 1 |
Author
| Barton, Erin E. | 1 |
| Beltyukova, Svetlana | 1 |
| Chen, Ching-I | 1 |
| Fox, Christine M. | 1 |
| Han, Chao | 1 |
| Hay, Peter J. | 1 |
| Macdonald, Doune | 1 |
| Musselwhite, Dorothy J. | 1 |
| Pribble, Lois | 1 |
| Rudner, Lawrence M. | 1 |
| Stone, Gregory Ethan | 1 |
Publication Type
| Journal Articles | 5 |
| Reports - Research | 3 |
| Reports - Evaluative | 2 |
| Tests/Questionnaires | 2 |
| ERIC Digests in Full Text | 1 |
| ERIC Publications | 1 |
Education Level
| Higher Education | 2 |
| Early Childhood Education | 1 |
| Postsecondary Education | 1 |
| Preschool Education | 1 |
| Secondary Education | 1 |
Location
| China (Beijing) | 1 |
Musselwhite, Dorothy J.; Wesolowski, Brian C. – Journal of Research in Music Education, 2018
The purpose of this study was to evaluate the psychometric quality (i.e., validity and reliability) of a rating scale to assess pre-service teachers' lesson plan development in the context of secondary-level music performance classrooms. The research questions that guided this study include: (1) What items demonstrate acceptable model fit for the…
Descriptors: Psychometrics, Likert Scales, Preservice Teachers, Lesson Plans
Han, Chao – Language Assessment Quarterly, 2016
As a property of test scores, reliability/dependability constitutes an important psychometric consideration, and it underpins the validity of measurement results. A review of interpreter certification performance tests (ICPTs) reveals that (a) although reliability/dependability checking has been recognized as an important concern, its theoretical…
Descriptors: Foreign Countries, Scores, English, Chinese
Barton, Erin E.; Pribble, Lois; Chen, Ching-I – Journal of Early Intervention, 2013
Three studies are described that examined the relation between performance-based (PB) feedback delivered via e-mail and preschool teachers' use of recommended practices. The authors conducted the first two studies in the same classroom with different classroom staff. The third study was conducted with three different teachers employed in…
Descriptors: Electronic Mail, Feedback (Response), Preschool Teachers, Teaching Methods
Hay, Peter J.; Macdonald, Doune – Assessment in Education: Principles, Policy & Practice, 2008
This paper draws on semi-structured interview data and participant observations of senior secondary Physical Education (PE) teachers and students at two school sites across 20 weeks of the school year. The data indicated that the teachers in this study made progressive judgements about students' level of achievement across each unit of work…
Descriptors: Secondary School Teachers, Evaluative Thinking, Physical Education, Secondary School Students
Stone, Gregory Ethan; Beltyukova, Svetlana; Fox, Christine M. – International Journal of Testing, 2008
Judge-mediated examinations are defined as those for which expert evaluation (using rubrics) is required to determine correctness, completeness, and reasonability of test-taker responses. The use of multifaceted Rasch modeling has led to improvements in the reliability of scoring such examinations. The establishment of criterion-referenced…
Descriptors: Interrater Reliability, High Stakes Tests, Standard Setting, Minimum Competencies
Rudner, Lawrence M. – 1992
Several common sources of error in assessment that depends on the use of judges are identified, and ways to reduce the impact of rating errors are examined. Numerous threats to the validity of scores based on ratings exist. These threats include: (1) the halo effect; (2) stereotyping; (3) perception differences; (4) leniency/stringency error; and…
Descriptors: Alternative Assessment, Error of Measurement, Evaluation Methods, Evaluators

