ERIC - Search Results

Showing all 15 results

Raters' Scoring Process in Assessment of Interpreting: An Empirical Study Based on Eye Tracking and Retrospective Verbalisation

Peer reviewed

Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024

Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…

Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability

Synthesizing Validity and Reliability Evidence for the Draw-A-Scientist Test

Peer reviewed

Julia Brochey-Taylor; Joseph A. Taylor – Educational Research and Reviews, 2024

The purpose of this synthesis study was to assess the reliability and validity of the Draw-A-Scientist Test (DAST) and its variations across multiple studies, aiming to understand limitations and propose modifications for future application within and beyond the science domain. Given the existence of multiple DAST versions, this study quantified…

Descriptors: Cognitive Tests, Freehand Drawing, Personality Measures, Projective Measures

Validation of Rating Processes within an Argument-Based Framework

Peer reviewed

Knoch, Ute; Chapelle, Carol A. – Language Testing, 2018

Argument-based validation requires test developers and researchers to specify what is entailed in test interpretation and use. Doing so has been shown to yield advantages (Chapelle, Enright, & Jamieson, 2010), but it also requires an analysis of how the concerns of language testers can be conceptualized in the terms used to construct a…

Descriptors: Test Validity, Language Tests, Evaluation Research, Rating Scales

A Comprehensive Review of International Research Using the Behavioral and Emotional Rating Scale

Peer reviewed

Lambert, Matthew C.; Sointu, Erkko T.; Epstein, Michael H. – International Journal of School & Educational Psychology, 2019

Child assessment practices have undergone, and are continuing to undergo, significant changes. Among the most prominent changes is the movement toward measuring child well-being, in general, and emotional and behavioral strengths, in particular. The Behavioral and Emotional Rating Scale (BERS) is a strength-based instrument which is widely used in…

Descriptors: Behavior Rating Scales, Translation, Psychometrics, Scores

Constructing a Validity Argument for the Objective Structured Assessment of Technical Skills (OSATS): A Systematic Review of Validity Evidence

Peer reviewed

Hatala, Rose; Cook, David A.; Brydges, Ryan; Hawkins, Richard – Advances in Health Sciences Education, 2015

In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane's framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected…

Descriptors: Measures (Individuals), Test Validity, Surgery, Skills

Instruments Assessing Anxiety in Adults with Intellectual Disabilities: A Systematic Review

Peer reviewed

Hermans, Heidi; van der Pas, Femke H.; Evenhuis, Heleen M. – Research in Developmental Disabilities: A Multidisciplinary Journal, 2011

Background: In the last decades several instruments measuring anxiety in adults with intellectual disabilities have been developed. Aim: To give an overview of the characteristics and psychometric properties of self-report and informant-report instruments measuring anxiety in this group. Method: Systematic review of the literature. Results:…

Descriptors: Mental Retardation, Learning Disabilities, Interrater Reliability, Measures (Individuals)

A Review of Literature Regarding Scientific Controversies Surrounding the Psychometric Properties of the Rorschach Inkblot Test

Park, Kevin Neil – Online Submission, 2009

The Rorschach Inkblot Test has been the focus of intense controversy, significantly impacting clinicians who currently rely on Exner's Comprehensive System (CS; Exner, 2003) in clinical and forensic settings. This paper evaluates recent empirical CS research to determine whether or not it reveals lack of scientific merit as some skeptics have…

Descriptors: Psychometrics, Psychological Testing, Psychological Evaluation, Literature Reviews

Assessing Reliability: Critical Corrections for a Critical Examination of the Rorschach Comprehensive System.

Peer reviewed

Meyer, Gregory J. – Psychological Assessment, 1997

In reply to criticism of the Rorschach Comprehensive System (CS) by J. Wood, M. Nezworski, and W. Stejskal (1996), this article presents a meta-analysis of published data indicating that the CS has excellent chance-corrected interrater reliability. It is noted that the erroneous assumptions of Wood et al. make their assertions about validity…

Descriptors: Interrater Reliability, Meta Analysis, Test Use, Test Validity

The Reliability of the Comprehensive System for the Rorschach: A Comment on Meyer (1997).

Peer reviewed

Wood, James M.; Nezworski, M. Teresa; Stejskal, William J. – Psychological Assessment, 1997

G. Meyer (1997) attempts to refute the present authors' criticisms of the interrater reliability of the Rorschach Comprehensive System (CS) but misrepresents their position and offers a flawed meta-analysis in support of his own. Rorschach proponents need to undertake high-quality replicated studies of CS reliability and validity. (SLD)

Descriptors: Interrater Reliability, Meta Analysis, Test Use, Test Validity

Thinking Clearly about Reliability: More Critical Corrections Regarding the Rorschach Comprehensive System.

Peer reviewed

Meyer, Gregory J. – Psychological Assessment, 1997

Replies to Wood et al. and documents limitations of their conclusions about the Rorschach Comprehensive System (CS), supporting Meyer's own meta-analysis, which finds adequate interrater reliability for the CS. (SLD)

Descriptors: Interrater Reliability, Meta Analysis, Test Use, Test Validity

Face Validity Revisited.

Peer reviewed

Nevo, Baruch – Journal of Educational Measurement, 1985

A literature review and a proposed means of measuring face validity, a test's appearance of being valid, are presented. Empirical evidence from examinees' perceptions of a college entrance examination supports the reliability of measuring face validity. (GDC)

Descriptors: College Entrance Examinations, Evaluation Methods, Evaluators, Foreign Countries

Clarifying the Blurred Image: Estimating the Inter-Rater Reliability of Performance Assessments.

Moore, Alan D.; Young, Suzanne – 1997

As schools move toward performance assessment, there is increasing discussion of using these assessments for accountability purposes. When used for making decisions, performance assessments must meet high standards of validity and reliability. One major source of unreliability in performance assessments is interrater disagreement. In this paper,…

Descriptors: Accountability, Correlation, Elementary Secondary Education, Generalizability Theory

The Direct Assessment of Writing Skill: A Measurement Review. College Board Report No. 83-6.

Breland, Hunter M. – 1983

Direct assessment of writing skill, usually considered to be synonymous with assessment by means of writing samples, is reviewed in terms of its history and with respect to evidence of its reliability and validity. Reliability is examined as it is influenced by reader inconsistency, domain sampling, and other sources of error. Validity evidence is…

Descriptors: Essay Tests, Evaluation Needs, Higher Education, Interrater Reliability

Towards Communicative Measurement of Writing: Where Are We Now?

Salies, Tania Gastao – 1998

A discussion of the evaluation of writing, particularly in English as a Second Language, argues for a communicative approach reflecting the current approach to language teaching and learning. The movement toward more communication-oriented and more valid language testing is examined briefly, and direct assessment is chosen as the preferred format…

Descriptors: Communicative Competence (Languages), English (Second Language), Evaluation Criteria, Foreign Countries

Assessing Speaking.

Peer reviewed

Turner, Jean – Annual Review of Applied Linguistics, 1998

This review of research on second-language oral testing outlines the nature of early research in interview-format proficiency testing, then reports on new directions in investigation of construct validity of interview-format and other oral skills tests through examination of examinee, interviewer, and rater performance. Research on empirically…

Descriptors: Construct Validity, Educational Trends, Interrater Reliability, Interviews