Davis, Larry; Norris, John – ETS Research Report Series, 2021
The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The "TOEFL® Essentials"™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the…
Descriptors: Task Analysis, Language Proficiency, Speech Communication, Language Tests
Smolinsky, Lawrence; Marx, Brian D.; Olafsson, Gestur; Ma, Yanxia A. – Journal of Educational Computing Research, 2020
Computer-based testing is an expanding use of technology offering advantages to teachers and students. We studied Calculus II classes for science, technology, engineering, and mathematics majors using different testing modes. Three sections with 324 students employed: paper-and-pencil testing, computer-based testing, and both. Computer tests gave…
Descriptors: Test Format, Computer Assisted Testing, Paper (Material), Calculus
Edward Paul Getman – Online Submission, 2020
Despite calls for engaging assessments targeting young language learners (YLLs) between 8 and 13 years old, what makes assessment tasks engaging and how such task characteristics affect measurement quality have not been well studied empirically. Furthermore, there has been a dearth of validity research about technology-enhanced speaking tests for…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Learner Engagement
Alweis, Richard L.; Fitzpatrick, Caroline; Donato, Anthony A. – Journal of Education and Training Studies, 2015
Introduction: The Multiple Mini-Interview (MMI) format appears to mitigate individual rater biases. However, the format itself may introduce structural systematic bias, favoring extroverted personality types. This study aimed to gain a better understanding of these biases from the perspective of the interviewer. Methods: A sample of MMI…
Descriptors: Interviews, Interrater Reliability, Qualitative Research, Semi Structured Interviews
New York State Education Department, 2014
This technical report provides an overview of the New York State Alternate Assessment (NYSAA), including a description of the purpose of the NYSAA, the processes utilized to develop and implement the NYSAA program, and stakeholder involvement in those processes. The purpose of this report is to document the technical aspects of the 2013-14 NYSAA.…
Descriptors: Alternative Assessment, Educational Assessment, State Departments of Education, Student Evaluation
Peer reviewed: Waters, L. K.; And Others – Perceptual and Motor Skills, 1982
Multitrait-multimethod analysis was performed on instructors' ratings from behaviorally anchored rating scales, graphic rating scales, and mixed standard scales. Two samples of 100 undergraduate students were distinguished on the basis of whether the statements on the mixed-standard scale were behaviorally specific or more generic descriptions of…
Descriptors: Behavior Rating Scales, Discriminant Analysis, Higher Education, Interrater Reliability
Alderson, J. Charles; And Others – 1995
The guide is intended for teachers who must construct language tests and for other professionals who may need to construct, evaluate, or use the results of language tests. Most examples are drawn from the field of English-as-a-Second-Language instruction in the United Kingdom, but the principles and practices described may be applied to the…
Descriptors: Educational Trends, English (Second Language), Interrater Reliability, Language Tests
Peer reviewed: Kinicki, Angelo J.; And Others – Educational and Psychological Measurement, 1985
Using both the Behaviorally Anchored Rating Scales (BARS) and the Purdue University Scales, 727 undergraduates rated 32 instructors. The BARS had less halo effect, more leniency error, and lower interrater reliability. Both formats were valid. The two tests did not differ in ratee discrimination or susceptibility to rating bias. (Author/GDC)
Descriptors: Behavior Rating Scales, College Faculty, Comparative Testing, Higher Education
Peer reviewed: Turner, Jean – Annual Review of Applied Linguistics, 1998
This review of research on second-language oral testing outlines the nature of early research in interview-format proficiency testing, then reports on new directions in investigation of construct validity of interview-format and other oral skills tests through examination of examinee, interviewer, and rater performance. Research on empirically…
Descriptors: Construct Validity, Educational Trends, Interrater Reliability, Interviews
Florida State Dept. of Education, Tallahassee. Div. of Vocational, Adult, and Community Education. – 1991
This packet contains a manual and a workbook for developing performance tests in vocational education. The manual gives an in-depth description of how to develop, score, and use performance tests. It includes the following sections: definitions of performance testing, steps in developing a performance test, selecting a performance development…
Descriptors: Interrater Reliability, Performance Tests, Postsecondary Education, Scoring

