ERIC - Search Results


Showing 1 to 15 of 16 results

Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021

Peer reviewed

Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023

Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, "etc.") of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…

Descriptors: Chemistry, Periodicals, Journal Articles, Science Education

ITC Guidelines for the Large-Scale Assessment of Linguistically and Culturally Diverse Populations

Peer reviewed

International Journal of Testing, 2019

These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…

Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage

Dynamic Assessment of Sentence Structure (DASS): Design and Evaluation of a Novel Procedure for the Assessment of Syntax in Children with Language Impairments

Peer reviewed

Hasson, Natalie; Dodd, Barbara; Botting, Nicola – International Journal of Language & Communication Disorders, 2012

Background: Sentence construction and syntactic organization are known to be poor in children with specific language impairments (SLI), but little is known about the way in which children with SLI approach language tasks, and static standardized tests contribute little to the differentiation of skills within the population of children with…

Descriptors: Alternative Assessment, Sentence Structure, Syntax, Language Processing

Diagnosing the English Speaking Ability of College Students in China -- Validation of the Diagnostic College English Speaking Test

Zhao, Zhongbao – RELC Journal: A Journal of Language Teaching and Research, 2013

This study investigates the validity of the Diagnostic College English Speaking Test (DCEST) in the context of EFL teaching and learning in China. The experiment was conducted in three stages over the course of eight weeks at a national key university in China. By means of test administration and questionnaire survey, the researcher gathered…

Descriptors: Oral Language, Construct Validity, Language Tests, Diagnostic Tests

New York State Alternate Assessment Technical Report, 2013-14

New York State Education Department, 2014

This technical report provides an overview of the New York State Alternate Assessment (NYSAA), including a description of the purpose of the NYSAA, the processes utilized to develop and implement the NYSAA program, and stakeholder involvement in those processes. The purpose of this report is to document the technical aspects of the 2013-14 NYSAA.…

Descriptors: Alternative Assessment, Educational Assessment, State Departments of Education, Student Evaluation

Sampling of Common Items: An Unrecognized Source of Error in Test Equating. CSE Report 636

Michaelides, Michalis P.; Haertel, Edward H. – Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2004

There is variability in the estimation of an equating transformation because common-item parameters are obtained from responses of samples of examinees. The most commonly used standard error of equating quantifies this source of sampling error, which decreases as the sample size of examinees used to derive the transformation increases. In a…

Descriptors: Test Items, Testing, Error Patterns, Interrater Reliability

A Reliability Study of BDAE-3 Discourse Coding

Peer reviewed

Powell, Thomas W. – Clinical Linguistics & Phonetics, 2006

The third edition of the "Boston Diagnostic Aphasia Examination" (Goodglass, Kaplan, and Barresi) introduced standardized procedures for coding discourse samples elicited using the well known Cookie Theft illustration. To evaluate the reliability of this discourse coding procedure, a transcribed sample was coded by 14 novice examiners…

Descriptors: Examiners, Interrater Reliability, Test Reliability, Aphasia

Assessment of Technical Aspects of WISC-R Administration.

Peer reviewed

Stewart, Krista J. – Psychology in the Schools, 1987

Evaluated the technical aspects of three Wechsler Intelligence Scale for Children-Revised (WISC-R) administrations of five psychology graduate students using the WISC-R Administration Observational Checklist (WAOC) to evaluate interrater agreement. Students performed significantly better on the second than on the first observation, with…

Descriptors: Educational Diagnosis, Error Patterns, Examiners, Graduate Students

Exploring Rater Behaviour with Rasch Techniques.

McNamara, T. F.; Adams, R. J. – 1991

A preliminary study is reported of the use of new multifaceted Rasch measurement mechanisms for investigating rater characteristics in language testing. Ratings from four judges of scripts from 50 candidates taking the International English Language Testing System test, a test of English for Academic Purposes, are analyzed. The analysis…

Descriptors: Comparative Analysis, English (Second Language), Foreign Countries, Interrater Reliability

The Triple-Jump Examination as an Assessment Tool in the Problem-Based Medical Curriculum at the University of Hawaii.

Peer reviewed

Smith, Richard Merrill – Academic Medicine, 1993

A University of Hawaii study compared objective and subjective assessments of the three-step triple jump examination which tests medical students' clinical problem-solving processes. Subjects were 58 first-year students. Results found the subjective assessments were more consistent across problems of varying difficulty level than were objective…

Descriptors: Case Studies, Difficulty Level, Higher Education, Interrater Reliability

Characteristics of the Test Components of the IELTS Battery: Australian Trial Data.

Griffin, Patrick – 1990

Results of the International English Language Testing System (IELTS) battery trials in Australia are reported. The IELTS tests of productive language skills use direct assessment strategies and subjective scoring according to detailed guidelines. The receptive skills tests use indirect assessment strategies and clerical scoring procedures.…

Descriptors: English (Second Language), Foreign Countries, Grammar, Interrater Reliability

Language Test Construction and Evaluation.

Alderson, J. Charles; And Others – 1995

The guide is intended for teachers who must construct language tests and for other professionals who may need to construct, evaluate, or use the results of language tests. Most examples are drawn from the field of English-as-a-Second-Language instruction in the United Kingdom, but the principles and practices described may be applied to the…

Descriptors: Educational Trends, English (Second Language), Interrater Reliability, Language Tests

Towards Communicative Measurement of Writing: Where Are We Now?

Salies, Tania Gastao – 1998

A discussion of the evaluation of writing, particularly in English as a Second Language, argues for a communicative approach reflecting the current approach to language teaching and learning. The movement toward more communication-oriented and more valid language testing is examined briefly, and direct assessment is chosen as the preferred format…

Descriptors: Communicative Competence (Languages), English (Second Language), Evaluation Criteria, Foreign Countries

Assessing Speaking.

Peer reviewed

Turner, Jean – Annual Review of Applied Linguistics, 1998

This review of research on second-language oral testing outlines the nature of early research in interview-format proficiency testing, then reports on new directions in investigation of construct validity of interview-format and other oral skills tests through examination of examinee, interviewer, and rater performance. Research on empirically…

Descriptors: Construct Validity, Educational Trends, Interrater Reliability, Interviews

A Survey of Issues and Item Writing in Language Testing.

Strong, Gregory – Thought Currents in English Literature, 1995

This paper traces developments in educational psychology and measurement that led to the Test of English as a Foreign Language (TOEFL) and the Test of English for International Communication (TOEIC) and the application of educational measurement terms such as validity and reliability to testing. Use of a table of specifications for planning…

Descriptors: Cloze Procedure, Difficulty Level, English (Second Language), Foreign Countries
