Showing all 15 results
Peer reviewed
Lu, Chia-Chen; Luh, Ding-Bang – Creativity Research Journal, 2012
Although previous studies have used raters with different levels of experience to rate product creativity under the Consensual Assessment Technique (CAT), the validity of replacing CAT with another measurement tool has not been adequately tested. This study aimed to compare raters with different levels of experience (expert vs.…
Descriptors: Creativity, Interrater Reliability, Construct Validity, Comparative Analysis
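In CAT studies of this kind, agreement among raters is often summarized with Cronbach's alpha computed across raters. The following is a minimal Python sketch of that calculation (illustrative only, not drawn from the article); the rating matrix and scale are hypothetical.

import numpy as np

def cronbach_alpha(scores):
    # scores: 2-D array, rows = products rated, columns = raters
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of raters
    rater_var = scores.var(axis=0, ddof=1)       # each rater's score variance
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed scores
    return (k / (k - 1)) * (1 - rater_var.sum() / total_var)

# Hypothetical data: five products scored by three raters on a 1-7 scale
ratings = [[5, 6, 5], [3, 3, 4], [6, 7, 6], [2, 2, 3], [4, 5, 4]]
print(round(cronbach_alpha(ratings), 3))

Values near 1 indicate that raters rank the products consistently, which is the kind of evidence a comparison of expert and novice panels relies on.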
Peer reviewed
Baker, Beverly A. – Assessing Writing, 2010
In high-stakes writing assessments, rater training in the use of a rating scale does not eliminate variability in grade attribution. This realisation has been accompanied by research that explores possible sources of rater variability, such as rater background or rating scale type. However, there has been little consideration thus far of…
Descriptors: Foreign Countries, Writing Evaluation, Writing Tests, Testing
Peer reviewed
Mogey, Nora; Paterson, Jessie; Burk, John; Purcell, Michael – ALT-J: Research in Learning Technology, 2010
Students at the University of Edinburgh do almost all their work on computers, but at the end of the semester they are examined by handwritten essays. Intuitively it would be appealing to allow students the choice of handwriting or typing, but this raises a concern that perhaps this might not be "fair"--that the choice a student makes,…
Descriptors: Handwriting, Essay Tests, Interrater Reliability, Grading
Michaelides, Michalis P.; Haertel, Edward H. – Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2004
There is variability in the estimation of an equating transformation because common-item parameters are obtained from responses of samples of examinees. The most commonly used standard error of equating quantifies this source of sampling error, which decreases as the sample size of examinees used to derive the transformation increases. In a…
Descriptors: Test Items, Testing, Error Patterns, Interrater Reliability
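The sampling variability described here can be illustrated with a resampling sketch: draw bootstrap samples of examinees, re-estimate the equating constant each time, and take the standard deviation of the estimates. The Python sketch below is illustrative only (simple mean equating on simulated data), not the procedure used in the report.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical common-item scores for the two examinee groups
group_x = rng.normal(20, 4, size=300)   # group taking form X
group_y = rng.normal(22, 4, size=300)   # group taking form Y

def mean_equating_constant(x, y):
    # Shift applied to form-X scores so the two groups match on the common items
    return y.mean() - x.mean()

boot_estimates = []
for _ in range(2000):
    bx = rng.choice(group_x, size=group_x.size, replace=True)
    by = rng.choice(group_y, size=group_y.size, replace=True)
    boot_estimates.append(mean_equating_constant(bx, by))

print("bootstrap SE of the equating constant:", round(np.std(boot_estimates, ddof=1), 3))

Doubling the examinee samples shrinks this standard error by roughly the square root of two, which is the relationship the abstract points to.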
O'Neill, Thomas R.; Lunz, Mary E. – 1997
This paper illustrates a method to study rater severity across exam administrations. A multi-facet Rasch model defined the ratings as being dominated by four facets: examinee ability, rater severity, project difficulty, and task difficulty. Ten years of data from administrations of a histotechnology performance assessment were pooled and analyzed…
Descriptors: Ability, Comparative Analysis, Equated Scores, Interrater Reliability
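The four facets named in the abstract fit the standard many-facet Rasch specification; in LaTeX (notation assumed here, not taken from the paper):

\ln \frac{P_{nrptk}}{P_{nrpt(k-1)}} = B_n - C_r - D_p - E_t - F_k

where B_n is the ability of examinee n, C_r the severity of rater r, D_p the difficulty of project p, E_t the difficulty of task t, and F_k the threshold for rating category k. Because all parameters sit on the same logit scale, rater severities estimated from pooled administrations can be compared directly, which is what makes an analysis across ten years of data possible.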
Peer reviewed
Tarico, Valerie S.; And Others – Journal of Counseling Psychology, 1986
Compared three methods of rating thoughts: self-rating by subjects, rating by experts with thoughts presented randomly, and rating by experts with thoughts presented in context, among 107 students who listed their thoughts prior to giving a speech. Results indicated that all three methods were equally predictive of speech anxiety and performance.…
Descriptors: Anxiety, Cognitive Measurement, Cognitive Processes, Comparative Analysis
Kenyon, Dorry; Stansfield, Charles W. – 1993
This paper examines whether individuals who train themselves to score a performance assessment will rate acceptably when compared to known standards. Research on the efficacy of rater self-training materials developed by the Center for Applied Linguistics for the Texas Oral Proficiency Test (TOPT) is examined. The rater self-training materials are described…
Descriptors: Bilingual Education, Comparative Analysis, Evaluators, Individual Characteristics
Takala, Sauli – 1998
This paper discusses recent developments in language testing. It begins with a review of the traditional criteria that are applied to all measurement and outlines recent emphases that derive from the expanding range of stakeholders. Drawing on Alderson's seminal work, criteria are presented for evaluating communicative language tests. Developments…
Descriptors: Alternative Assessment, Communicative Competence (Languages), Comparative Analysis, Evaluation Criteria
McNamara, T. F.; Adams, R. J. – 1991
A preliminary study is reported of the use of new multifaceted Rasch measurement mechanisms for investigating rater characteristics in language testing. Ratings from four judges of scripts from 50 candidates taking the International English Language Testing System test, a test of English for Academic Purposes, are analyzed. The analysis…
Descriptors: Comparative Analysis, English (Second Language), Foreign Countries, Interrater Reliability
Hori, Utako; Ito, Tokumi; Kitazawa, Mieko; Masuda, Masako; Ogiwara, Chikako; Saito, Mariko; Yoneda, Yukiyo – 1996
A group of seven Japanese-language Oral Proficiency Interview (OPI) testers licensed by the American Council on the Teaching of Foreign Languages (ACTFL) conducted research related to ACTFL-OPI criteria. They first examined 24 audiotaped interview tests to see what kind of consistency there would be when individual testers applied general criteria…
Descriptors: Audiotape Recordings, Comparative Analysis, Foreign Countries, Grammar
Peer reviewed
Wigglesworth, Gillian – Language Testing, 1997
In this study, planning time was manipulated as a variable in a trial administration of a semi-direct oral interaction test. Discourse analytic techniques were used to determine the nature and/or significance of difference in the elicited discourse across two conditions in terms of complexity and accuracy. Findings suggest that planning time may…
Descriptors: Cognitive Development, Communicative Competence (Languages), Comparative Analysis, Discourse Analysis
Nakamura, Yuji – Journal of Communication Studies, 1997
This study investigated the effects of three aspects of language testing (test task, familiarity with an interviewer, and test method) on both tester and tested. Data were drawn from several previous studies by the researcher. Concerning test task, data were analyzed for the type of topic students wanted most to talk about or preferred not to talk…
Descriptors: Behavior Patterns, Comparative Analysis, English (Second Language), Interrater Reliability
Peer reviewed
Lee, H. K. – Assessing Writing, 2004
This study aimed to comprehensively investigate the impact of a word-processor on an ESL writing assessment, covering comparison of inter-rater reliability, the quality of written products, the writing process across different testing occasions using different writing media, and students' perception of a computer-delivered test. Writing samples of…
Descriptors: Writing Evaluation, Student Attitudes, Writing Tests, Testing
Russikoff, Karen A. – 1994
Problems inherent in the holistic scoring of essay examinations written by limited-English-speakers are examined, particularly in the context of one California state college in which English writing skills, holistically assessed, are required for graduation. These problems include lack of interrater reliability, raters' perceptions of their role,…
Descriptors: Case Studies, College Faculty, College Instruction, Comparative Analysis
Chalhoub-Deville, Micheline – 1993
This study investigated whether different groups of native speakers assess second language learners' language skills differently for three elicitation techniques. Subjects were six learners of college-level Arabic as a second language, tape-recorded performing three tasks: participating in a modified oral proficiency interview, narrating a picture…
Descriptors: Arabic, College Students, Comparative Analysis, Higher Education