Showing 481 to 495 of 503 results
Peer reviewed
Loughran, Sandra B. – Early Childhood Education Journal, 2003
Investigated the agreement and stability of three teacher rating scales used to assess attention deficit/hyperactivity disorder (ADHD) in preschoolers. Found that agreement among the rating scales and interrater agreement between teacher and assistant teacher ratings yielded noticeably stronger correlations at elementary school than at preschool 4…
Descriptors: At Risk Persons, Attention Deficit Disorders, Behavior Rating Scales, Early Childhood Education
Sullivan, Francis J. – 1986
A study examined how pragmatic form influences evaluation of student essays in university placement testing. Specifically, the study documented how patterns in students' use of information (assumed to be either old, inferable, or new for readers) affected the holistic scores for quality given to the essays. Subjects, 99 randomly selected entering…
Descriptors: College Freshmen, Essay Tests, Evaluation Criteria, Evaluation Methods
Perkins, Kyle – 1986
Based on the premise that composition skills and their evaluation are crucial to the educational process, this paper presents a tentative research program for conducting future English as a second language (ESL) composition evaluation studies. The program developed in the paper covers the following topics as areas which merit further rigorous…
Descriptors: Elementary Secondary Education, English (Second Language), Error Analysis (Language), Evaluation Criteria
Bejar, Isaac I. – 1985
The feasibility of reducing scoring costs for the Test of Spoken English (TSE) by using one rater was investigated. Currently, two raters are used. It was found that, because of the possibility of different standards used by potential raters, it does not appear feasible to use a single rater as the sole determiner of speaking proficiency under the…
Descriptors: Analysis of Covariance, Cost Effectiveness, English (Second Language), Evaluation Criteria
Peer reviewed
Turner, Jean – Annual Review of Applied Linguistics, 1998
This review of research on second-language oral testing outlines the nature of early research in interview-format proficiency testing, then reports on new directions in investigation of construct validity of interview-format and other oral skills tests through examination of examinee, interviewer, and rater performance. Research on empirically…
Descriptors: Construct Validity, Educational Trends, Interrater Reliability, Interviews
Strahan, David B.; Van Hoose, John – 1986
The Invitational Teaching Observation Instrument was developed to extend effective teaching through self-assessment and clinical supervision. Based on the theories of Invitational Education, this test analyzed both personal and professional dimensions of teaching. Items reflected research on effective teaching and were cross-validated with two…
Descriptors: Behavior Rating Scales, Classroom Observation Techniques, Elementary School Teachers, Evaluation Criteria
Gearhart, Maryl; Novak, John R.; Herman, Joan L. – 1994
This study examined technical questions regarding the reliability and validity of large-scale portfolio assessment, focusing on: (1) whether raters can score collections of writing reliably with rubrics designed for single samples; (2) whether ratings derived from different frameworks differ in their capacities to support technically sound…
Descriptors: Educational Assessment, Elementary Education, Elementary School Students, Essay Tests
Srebnik, Debra – 1996
This paper discusses the results of a study that investigated the validity and reliability of the Ecology Rating Scale (ERS). The ERS is a brief, multi-dimensional level-of-functioning instrument that can be rated by parents or clinicians. The ERS comprises seven domains of youth functioning: family, school, emotional, legal/justice,…
Descriptors: Academic Achievement, Adolescents, Behavior Disorders, Child Health
Florida State Dept. of Education, Tallahassee. Div. of Vocational, Adult, and Community Education. – 1991
This packet contains a manual and a workbook for developing performance tests in vocational education. The manual gives an in-depth description of how to develop, score, and use performance tests. It includes the following sections: definitions of performance testing, steps in developing a performance test, selecting a performance development…
Descriptors: Interrater Reliability, Performance Tests, Postsecondary Education, Scoring
Shavelson, Richard J.; And Others – 1993
In this paper, performance assessments are cast within a sampling framework. A performance assessment score is viewed as a sample of student performance drawn from a complex universe defined by a combination of all possible tasks, occasions, raters, and measurement methods. Using generalizability theory, the authors present evidence bearing on the…
Descriptors: Academic Achievement, Educational Assessment, Error of Measurement, Evaluators
North Carolina State Dept. of Public Instruction, Raleigh. Div. of Accountability/Testing. – 2001
During the 1999-2000 school year, the North Carolina Alternate Assessment Portfolio was administered statewide, as a pilot program, to eligible students with serious cognitive deficits. This report provides state, regional, and local education agency results of that pilot program. The purpose of the pilot was to review the feasibility, validity, and…
Descriptors: Academic Achievement, American Indians, Cultural Differences, Elementary Secondary Education
Strong, Gregory – Thought Currents in English Literature, 1995
This paper traces developments in educational psychology and measurement that led to the Test of English as a Foreign Language (TOEFL) and the Test of English for International Communication (TOEIC) and the application of educational measurement terms such as validity and reliability to testing. Use of a table of specifications for planning…
Descriptors: Cloze Procedure, Difficulty Level, English (Second Language), Foreign Countries
Carlson, Sybil B.; And Others – 1985
Four writing samples were obtained from 638 foreign college applicants who represented three major foreign language groups (Arabic, Chinese, and Spanish), and from 60 native English speakers. All four were scored holistically, two were also scored for sentence-level and discourse-level skills, and some were scored by the Writer's Workbench…
Descriptors: Arabic, Chinese, College Entrance Examinations, Computer Software
Shiflett, Samuel; And Others – 1985
A study was undertaken to improve the measurement of small team performance within the Army. A provisional taxonomy of team-level performance functions was field-validated; criteria and measures of the functions were developed; and their reliability was examined. The provisional taxonomy, used for observing Army field training exercises, was used…
Descriptors: Behavior Rating Scales, Classification, Evaluation Criteria, Evaluators
Peer reviewed
Polio, Charlene G. – Language Learning, 1997
Investigates the reliability of measures of linguistic accuracy in second language writing. The study uses a holistic scale, error-free T-units, and an error classification system on the essays of English-as-a-Second-Language students and discusses why disagreements arise within a rater and between raters. (24 references) (Author/CK)
Descriptors: College Students, English (Second Language), Error Analysis (Language), Error of Measurement