ERIC - Search Results

Showing all 10 results

Developing an Innovative Elicited Imitation Task for Efficient English Proficiency Assessment. TOEFL® Research Report. RR-96. ETS RR-21-24

Peer reviewed

Davis, Larry; Norris, John – ETS Research Report Series, 2021

The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The "TOEFL® Essentials"™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the…

Descriptors: Task Analysis, Language Proficiency, Speech Communication, Language Tests

Computer-Based and Paper-and-Pencil Tests: A Study in Calculus for STEM Majors

Peer reviewed

Smolinsky, Lawrence; Marx, Brian D.; Olafsson, Gestur; Ma, Yanxia A. – Journal of Educational Computing Research, 2020

Computer-based testing is an expanding use of technology offering advantages to teachers and students. We studied Calculus II classes for science, technology, engineering, and mathematics majors using different testing modes. Three sections with 324 students employed: paper-and-pencil testing, computer-based testing, and both. Computer tests gave…

Descriptors: Test Format, Computer Assisted Testing, Paper (Material), Calculus

Age, Task Characteristics, and Acoustic Indicators of Engagement: Investigations into the Validity of a Technology-Enhanced Speaking Test for Young Language Learners

Edward Paul Getman – Online Submission, 2020

Despite calls for engaging assessments targeting young language learners (YLLs) between 8 and 13 years old, what makes assessment tasks engaging and how such task characteristics affect measurement quality have not been well studied empirically. Furthermore, there has been a dearth of validity research about technology-enhanced speaking tests for…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Learner Engagement

Rater Perceptions of Bias Using the Multiple Mini-Interview Format: A Qualitative Study

Peer reviewed

Alweis, Richard L.; Fitzpatrick, Caroline; Donato, Anthony A. – Journal of Education and Training Studies, 2015

Introduction: The Multiple Mini-Interview (MMI) format appears to mitigate individual rater biases. However, the format itself may introduce structural systematic bias, favoring extroverted personality types. This study aimed to gain a better understanding of these biases from the perspective of the interviewer. Methods: A sample of MMI…

Descriptors: Interviews, Interrater Reliability, Qualitative Research, Semi Structured Interviews

New York State Alternate Assessment Technical Report, 2013-14

New York State Education Department, 2014

This technical report provides an overview of the New York State Alternate Assessment (NYSAA), including a description of the purpose of the NYSAA, the processes utilized to develop and implement the NYSAA program, and stakeholder involvement in those processes. The purpose of this report is to document the technical aspects of the 2013-14 NYSAA.…

Descriptors: Alternative Assessment, Educational Assessment, State Departments of Education, Student Evaluation

Multitrait-Multimethod Analysis of Three Rating Formats.

Peer reviewed

Waters, L. K.; And Others – Perceptual and Motor Skills, 1982

Multitrait-multimethod analysis was performed on instructors' ratings from behaviorally anchored rating scales, graphic rating scales, and mixed standard scales. Two samples of 100 undergraduate students were distinguished on the basis of whether the statements on the mixed-standard scale were behaviorally specific or more generic descriptions of…

Descriptors: Behavior Rating Scales, Discriminant Analysis, Higher Education, Interrater Reliability

Language Test Construction and Evaluation.

Alderson, J. Charles; And Others – 1995

The guide is intended for teachers who must construct language tests and for other professionals who may need to construct, evaluate, or use the results of language tests. Most examples are drawn from the field of English-as-a-Second-Language instruction in the United Kingdom, but the principles and practices described may be applied to the…

Descriptors: Educational Trends, English (Second Language), Interrater Reliability, Language Tests

Behaviorally Anchored Rating Scales vs. Summated Rating Scales: Psychometric Properties and Susceptibility to Rating Bias.

Peer reviewed

Kinicki, Angelo J.; And Others – Educational and Psychological Measurement, 1985

Using both the Behaviorally Anchored Rating Scales (BARS) and the Purdue University Scales, 727 undergraduates rated 32 instructors. The BARS had less halo effect, more leniency error, and lower interrater reliability. Both formats were valid. The two formats did not differ in ratee discrimination or susceptibility to rating bias. (Author/GDC)

Descriptors: Behavior Rating Scales, College Faculty, Comparative Testing, Higher Education

Assessing Speaking.

Peer reviewed

Turner, Jean – Annual Review of Applied Linguistics, 1998

This review of research on second-language oral testing outlines the nature of early research in interview-format proficiency testing, then reports on new directions in investigation of construct validity of interview-format and other oral skills tests through examination of examinee, interviewer, and rater performance. Research on empirically…

Descriptors: Construct Validity, Educational Trends, Interrater Reliability, Interviews

Performance Testing Manual and Workbook for Vocational Education Administrators and Teachers.

Florida State Dept. of Education, Tallahassee. Div. of Vocational, Adult, and Community Education. – 1991

This packet contains a manual and a workbook for developing performance tests in vocational education. The manual gives an in-depth description of how to develop, score, and use performance tests. It includes the following sections: definitions of performance testing, steps in developing a performance test, selecting a performance development…

Descriptors: Interrater Reliability, Performance Tests, Postsecondary Education, Scoring