Showing all 15 results
Peer reviewed
Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024
Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…
Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability
Peer reviewed
Julia Brochey-Taylor; Joseph A. Taylor – Educational Research and Reviews, 2024
The purpose of this synthesis study was to assess the reliability and validity of the Draw-A-Scientist Test (DAST) and its variations across multiple studies, aiming to understand limitations and propose modifications for future application within and beyond the science domain. Given the existence of multiple DAST versions, this study quantified…
Descriptors: Cognitive Tests, Freehand Drawing, Personality Measures, Projective Measures
Peer reviewed
Knoch, Ute; Chapelle, Carol A. – Language Testing, 2018
Argument-based validation requires test developers and researchers to specify what is entailed in test interpretation and use. Doing so has been shown to yield advantages (Chapelle, Enright, & Jamieson, 2010), but it also requires an analysis of how the concerns of language testers can be conceptualized in the terms used to construct a…
Descriptors: Test Validity, Language Tests, Evaluation Research, Rating Scales
Peer reviewed
Lambert, Matthew C.; Sointu, Erkko T.; Epstein, Michael H. – International Journal of School & Educational Psychology, 2019
Child assessment practices have undergone, and are continuing to undergo, significant changes. Among the most prominent changes is the movement toward measuring child well-being, in general, and emotional and behavioral strengths, in particular. The Behavioral and Emotional Rating Scale (BERS) is a strength-based instrument which is widely used in…
Descriptors: Behavior Rating Scales, Translation, Psychometrics, Scores
Peer reviewed
Hatala, Rose; Cook, David A.; Brydges, Ryan; Hawkins, Richard – Advances in Health Sciences Education, 2015
In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane's framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected…
Descriptors: Measures (Individuals), Test Validity, Surgery, Skills
Peer reviewed
Hermans, Heidi; van der Pas, Femke H.; Evenhuis, Heleen M. – Research in Developmental Disabilities: A Multidisciplinary Journal, 2011
Background: In the last decades several instruments measuring anxiety in adults with intellectual disabilities have been developed. Aim: To give an overview of the characteristics and psychometric properties of self-report and informant-report instruments measuring anxiety in this group. Method: Systematic review of the literature. Results:…
Descriptors: Mental Retardation, Learning Disabilities, Interrater Reliability, Measures (Individuals)
Park, Kevin Neil – Online Submission, 2009
The Rorschach Inkblot Test has been the focus of intense controversy, significantly impacting clinicians who currently rely on Exner's Comprehensive System (CS; Exner, 2003) in clinical and forensic settings. This paper evaluates recent empirical CS research to determine whether or not it reveals lack of scientific merit as some skeptics have…
Descriptors: Psychometrics, Psychological Testing, Psychological Evaluation, Literature Reviews
Peer reviewed
Meyer, Gregory J. – Psychological Assessment, 1997
In reply to criticism of the Rorschach Comprehensive System (CS) by J. Wood, M. Nezworski, and W. Stejskal (1996), this article presents a meta-analysis of published data indicating that the CS has excellent chance-corrected interrater reliability. It is noted that the erroneous assumptions of Wood et al. make their assertions about validity…
Descriptors: Interrater Reliability, Meta Analysis, Test Use, Test Validity
Peer reviewed
Wood, James M.; Nezworski, M. Teresa; Stejskal, William J. – Psychological Assessment, 1997
G. Meyer (1997) attempts to refute the present authors' criticisms of the interrater reliability of the Rorschach Comprehensive System (CS) but misrepresents their position and offers a flawed meta-analysis in support of his own. Rorschach proponents need to undertake high-quality replicated studies of CS reliability and validity. (SLD)
Descriptors: Interrater Reliability, Meta Analysis, Test Use, Test Validity
Peer reviewed
Meyer, Gregory J. – Psychological Assessment, 1997
Replies to Wood et al. and documents limitations of their conclusions about the Rorschach Comprehensive System (CS), supporting Meyer's own meta-analysis, which finds adequate interrater reliability for the CS. (SLD)
Descriptors: Interrater Reliability, Meta Analysis, Test Use, Test Validity
Peer reviewed
Nevo, Baruch – Journal of Educational Measurement, 1985
A literature review and a proposed means of measuring face validity, a test's appearance of being valid, are presented. Empirical evidence from examinees' perceptions of a college entrance examination supports the reliability of measuring face validity. (GDC)
Descriptors: College Entrance Examinations, Evaluation Methods, Evaluators, Foreign Countries
Moore, Alan D.; Young, Suzanne – 1997
As schools move toward performance assessment, there is increasing discussion of using these assessments for accountability purposes. When used for making decisions, performance assessments must meet high standards of validity and reliability. One major source of unreliability in performance assessments is interrater disagreement. In this paper,…
Descriptors: Accountability, Correlation, Elementary Secondary Education, Generalizability Theory
Breland, Hunter M. – 1983
Direct assessment of writing skill, usually considered to be synonymous with assessment by means of writing samples, is reviewed in terms of its history and with respect to evidence of its reliability and validity. Reliability is examined as it is influenced by reader inconsistency, domain sampling, and other sources of error. Validity evidence is…
Descriptors: Essay Tests, Evaluation Needs, Higher Education, Interrater Reliability
Salies, Tania Gastao – 1998
A discussion of the evaluation of writing, particularly in English as a Second Language, argues for a communicative approach reflecting the current approach to language teaching and learning. The movement toward more communication-oriented and more valid language testing is examined briefly, and direct assessment is chosen as the preferred format…
Descriptors: Communicative Competence (Languages), English (Second Language), Evaluation Criteria, Foreign Countries
Peer reviewed
Turner, Jean – Annual Review of Applied Linguistics, 1998
This review of research on second-language oral testing outlines the nature of early research in interview-format proficiency testing, then reports on new directions in investigation of construct validity of interview-format and other oral skills tests through examination of examinee, interviewer, and rater performance. Research on empirically…
Descriptors: Construct Validity, Educational Trends, Interrater Reliability, Interviews