Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Ferrara, Steven F. – 1987
Statistical equating of scores from direct writing assessments is performed in Maryland to hold the difficulty of passing the Maryland Writing Test (MWT) equivalent from year to year and to allow a demonstration of growth in student writing achievement over the years. Passage of the test is a high school graduation requirement. Prior to…
Descriptors: Equated Scores, Essay Tests, Graduation Requirements, High School Students
Micceri, Theodore – 1984
This paper investigates the reliability of the Florida Performance Measurement Systems' Summative Observation instrument. Developed for the Florida Beginning Teacher Evaluation Program, it provides behavioral ratings for teachers in a classroom setting. Data came from ratings of videotapes of nine teachers conducting actual lessons by nine teams…
Descriptors: Analysis of Variance, Classroom Observation Techniques, Elementary Secondary Education, Evaluation Methods
Bobek, Becky L.; Gore, Paul A. – American College Testing (ACT), Inc., 2004
This research report describes changes made to the Inventory of Work-Relevant Values when it was revised for online use as a part of the Internet version of DISCOVER. Users will see the following differences between the online and CD-ROM versions of the inventory: 22 items rather than 61, simplified presentation, and the contribution of all items…
Descriptors: Interrater Reliability, Field Tests, Internet, Test Construction
Smith, Teresa A. – 1997
The Third International Mathematics and Science Study (TIMSS) measured mathematics and science achievement of middle school students in more than 40 countries. About one quarter of the tests' nearly 300 items were free response items requiring students to generate their own answers. Scoring these responses used a two-digit diagnostic code rubric…
Descriptors: Comparative Education, English, Error of Measurement, Foreign Countries
Peer reviewed: Magill, Michael K.; And Others – Evaluation and the Health Professions, 1988
A computer-assisted interaction analysis system that describes behavior in small groups and focuses on active clinical problem-solving was developed and tested. Teacher and class characteristics are incorporated into the simulation, allowing feedback for improvement of medical teaching skills. (TJH)
Descriptors: Computer Simulation, Graduate Medical Education, Group Dynamics, Higher Education
Peer reviewed: MacRae, Helen M.; And Others – Academic Medicine, 1995
A study compared two methods of rating medical students' performances on history and physical examination: using checklists completed by standardized patients (SPs) and databases completed by students. Results of each method were correlated with ratings of students by three physicians for each SP-student encounter. Results showed checklists…
Descriptors: Case Studies, Check Lists, Comparative Analysis, Databases
Peer reviewed: Frederick, Brian P.; Olmi, D. Joe – Psychology in the Schools, 1994
Social interactions between children with Attention-Deficit/Hyperactivity Disorder (ADHD) and their teachers, peers, and parents are discussed. Problematic interactions may depend on social skills deficits. Changing the focus to ADHD children who are not experiencing social skills deficits may prove beneficial. A review of the previous literature…
Descriptors: Attention Deficit Disorders, Attention Span, Behavior Rating Scales, Children
Heath, Edward M.; Coleman, Karen J.; Lensegrav, Tera L.; Fallon, Jennifer A. – Research Quarterly for Exercise and Sport, 2006
The System for Observing Fitness Instruction Time (SOFIT) is a direct observation system specifically developed for use during physical education (PE; McKenzie, 1991; McKenzie, Sallis, & Nader, 1991). The purpose of this study was to validate the estimates of time spent in various physical activity intensities obtained with the paper and pencil…
Descriptors: Validity, Physical Activities, Physical Education, Physical Fitness
Rudner, Lawrence M.; Garcia, Veronica; Welch, Catherine – Journal of Technology, Learning, and Assessment, 2006
This report provides a two-part evaluation of the IntelliMetric[SM] automated essay scoring system based on its performance scoring essays from the Analytic Writing Assessment of the Graduate Management Admission Test[TM] (GMAT[TM]). The IntelliMetric system performance is first compared to that of individual human raters, a Bayesian system…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays
Bahadourian, Ara John; Tam, Kai Yung; Greer, R. Douglas; Rousseau, Marilyn K. – International Journal of Behavioral Consultation and Therapy, 2006
We report an experiment examining the academic performance of undergraduate students in two special education college courses. The experimenter/professor taught both courses in which he presented curriculum material via written learn units (LUs) (Greer & Hogin, 1999) or in a lecture format across randomly selected weeks in a 12-week semester.…
Descriptors: Undergraduate Students, Academic Achievement, Special Education, Education Courses
McGhee, Debbie E.; Lowell, Nana – New Directions for Teaching and Learning, 2003
This study compares mean ratings, inter-rater reliabilities, and the factor structure of items for online and paper student-rating forms from the University of Washington's Instructional Assessment System. (Contains 3 figures and 2 tables.)
Descriptors: Psychometrics, Factor Structure, Student Evaluation of Teacher Performance, Test Items
New Mexico Public Education Department, 2007
The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview and the technical characteristics of the 2007 NMSBA. The 2007 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Summary of student performance; (4) Statistical analyses of item and…
Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring
Clariana, Roy B.; Wallace, Patricia – Journal of Educational Computing Research, 2007
This proof-of-concept investigation describes a computer-based approach for deriving the knowledge structure of individuals and of groups from their written essays, and considers the convergent criterion-related validity of the computer-based scores relative to human rater essay scores and multiple-choice test scores. After completing a…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Construct Validity, Cognitive Structures
Janosik, Steven M. – NASPA Journal, 2007
Most conversations about ethics and professional behavior involve case studies and hypothetical situations. This study identifies and examines the most common concerns in professional behavior as reported by 303 student affairs practitioners in the field. Differences by gender, years of experience, organizational level, institutional type, and…
Descriptors: Ethics, Professional Personnel, Behavior, Antisocial Behavior
Stahl, John; And Others – 1996
On-line performance assessment was developed to maximize the usefulness of performance assessment and to minimize the time and labor costs incurred. This paper reports on the development of an on-line performance assessment instrument, focusing on the establishment and validation of the scoring rubric and its implementation in the Rasch model, the…
Descriptors: Computer Software, Computer Software Development, Cost Effectiveness, Interrater Reliability

