Showing 3,076 to 3,090 of 3,122 results
Shavelson, Richard J.; And Others – 1993
In this paper, performance assessments are cast within a sampling framework. A performance assessment score is viewed as a sample of student performance drawn from a complex universe defined by a combination of all possible tasks, occasions, raters, and measurement methods. Using generalizability theory, the authors present evidence bearing on the…
Descriptors: Academic Achievement, Educational Assessment, Error of Measurement, Evaluators
Myford, Carol M. – 1991
The aesthetic judgments of experts (casting directors and high school drama teachers), theater buffs, and novices were compared as they rated high school students' videotaped performances of Shakespearean monologues. It was hypothesized that theater buffs would represent an intermediate stage on the path to developing expertise in judging acting…
Descriptors: Ability, Acting, Aesthetic Values, Art Criticism
Silvestro, John R.; And Others – 1989
The job analysis procedures used in the development of the Illinois Certification Testing System are described. The degree of congruence between job analysis ratings provided by public school educators (PSEs) and teacher educators (TEs) who completed the job analysis surveys is examined. National Evaluation Systems, Inc., and the Illinois State…
Descriptors: Comparative Analysis, Content Analysis, Elementary Secondary Education, Interrater Reliability
Telese, James A.; Kulm, Gerald – 1995
A team of university and public school mathematics educators developed performance-based mathematics assessment tasks, aligned with the Texas Assessment of Academic Skills, for 93 students who had been identified as at-risk in mathematics. Scenarios were developed based on four contexts: (1) familiar activity; (2) social issue; (3)…
Descriptors: Analysis of Variance, Context Effect, Educational Assessment, Educational Environment
North Carolina State Dept. of Public Instruction, Raleigh. Div. of Accountability/Testing. – 2001
During the 1999-2000 school year, the North Carolina Alternate Assessment Portfolio was administered to eligible students with serious cognitive deficits statewide as a pilot program. This report provides state, regional, and local education agency results of that pilot program. The purpose of the pilot was to review the feasibility, validity, and…
Descriptors: Academic Achievement, American Indians, Cultural Differences, Elementary Secondary Education
1999
This document contains four symposium papers on assessing employee performance. In "Influence of Liking and Similarity on Multi-rater Proficiency Ratings of Managerial Competencies" (Reid A. Bates), the pattern of correlations identified between raters, independent variables, and different competencies suggests that raters may react…
Descriptors: Adult Education, Case Studies, Competence, Educational Needs
Peer reviewed
Tinsley, Barbara J.; And Others – Educational and Psychological Measurement, 1997
The convergent validity of peer, self, and teacher methods of assessing youths' risk propensity and the relation of these measures to health risk behavior were studied with 436 elementary and junior high school students. Findings demonstrate low congruence between rater sources. Prediction depended on behavior assessed and grade level. (SLD)
Descriptors: Age Differences, Behavior Patterns, Children, Elementary Education
Peer reviewed
Yang, Yongwei; Buckendahl, Chad W.; Juszkiewicz, Piotr J.; Bhola, Dennison S. – Journal of Applied Testing Technology, 2005
With the continual progress of computer technologies, computer automated scoring (CAS) has become a popular tool for evaluating writing assessments. Research on applications of these methodologies to new types of performance assessments is still emerging. While research has generally shown a high agreement of CAS system generated scores with those…
Descriptors: Scoring, Validity, Interrater Reliability, Comparative Analysis
Strong, Gregory – Thought Currents in English Literature, 1995
This paper traces developments in educational psychology and measurement that led to the Test of English as a Foreign Language (TOEFL) and the Test of English for International Communication (TOEIC) and the application of educational measurement terms such as validity and reliability to testing. Use of a table of specifications for planning…
Descriptors: Cloze Procedure, Difficulty Level, English (Second Language), Foreign Countries
Carlson, Sybil B.; And Others – 1985
Four writing samples were obtained from 638 foreign college applicants who represented three major foreign language groups (Arabic, Chinese, and Spanish), and from 60 native English speakers. All four were scored holistically, two were also scored for sentence-level and discourse-level skills, and some were scored by the Writer's Workbench…
Descriptors: Arabic, Chinese, College Entrance Examinations, Computer Software
Shiflett, Samuel; And Others – 1985
A study was undertaken to improve the measurement of small team performance within the Army. A provisional taxonomy of team-level performance functions was field-validated; criteria and measures of the functions were developed; and their reliability was examined. The provisional taxonomy, used for observing Army field training exercises, was used…
Descriptors: Behavior Rating Scales, Classification, Evaluation Criteria, Evaluators
Jaeger, Richard M.; Busch, John Christian – 1986
This study explores the use of the modified caution index (MCI) for identifying judges whose patterns of recommendations suggest that their judgments might be based on incomplete information, flawed reasoning, or inattention to their standard-setting tasks. It also examines the effect on test standards and passing rates when the test standards of…
Descriptors: Criterion Referenced Tests, Error of Measurement, Evaluation Methods, High Schools
Rose, Andrew M.; And Others – 1985
This third of three volumes reports on analytic procedures conducted to address various aspects of the scalar properties of the Device Effectiveness Forecasting Technique (DEFT). DEFT, a series of microcomputer programs applied to data gathered from rating scales, is used to evaluate simulator devices used in U.S. Army weapons training. The…
Descriptors: Adults, Computer Oriented Programs, Computer Simulation, Data Interpretation
Peer reviewed
Polio, Charlene G. – Language Learning, 1997
Investigates the reliability of measures of linguistic accuracy in second language writing. The study uses a holistic scale, error-free T-units, and an error classification system on the essays of English-as-a-Second-Language students and discusses why disagreements arise within a rater and between raters. (24 references) (Author/CK)
Descriptors: College Students, English (Second Language), Error Analysis (Language), Error of Measurement
Peer reviewed
Lustick, David; Sykes, Gary – Education Policy Analysis Archives, 2006
This study investigated the National Board for Professional Teaching Standards' (NBPTS) assessment process in order to identify, quantify, and substantiate learning outcomes from the participants. One hundred and twenty candidates for the Adolescent and Young Adult Science (AYA Science) Certificate were studied over a two-year period using the…
Descriptors: Intervention, National Standards, Young Adults, Program Effectiveness