ERIC - Search Results

Showing 1 to 15 of 23 results

The Value of Expanding Perspectives on Assessment

Peer reviewed

Janice Kinghorn; Katherine McGuire; Bethany L. Miller; Aaron Zimmerman – Assessment Update, 2024

In this article, the authors share their reflections on how different experiences and paradigms have broadened their understanding of the work of assessment in higher education. As they collaborated to create a panel for the 2024 International Conference on Assessing Quality in Higher Education, they recognized that they, as assessment…

Descriptors: Higher Education, Assessment Literacy, Evaluation Criteria, Evaluation Methods

The Reliability and Consequential Validity of Two Teacher-Administered Student Mathematics Diagnostic Assessments. Study Snapshot. REL 2020-039

Peer reviewed

Regional Educational Laboratory Southeast, 2020

Teachers need to assess their students' current level of mathematical understanding to provide appropriate interventions for students who are struggling. Several school districts in Georgia currently use two assessments for this purpose--the Global Strategy Stage (GloSS) and the Individual Knowledge Assessment of Number (IKAN). The IKAN is…

Descriptors: Mathematics Tests, Diagnostic Tests, Test Reliability, Test Validity

Measuring Program Quality, Part 2: Addressing Potential Cultural Bias in a Rater Reliability Exam

Peer reviewed

Richer, Amanda; Charmaraman, Linda; Ceder, Ineke – Afterschool Matters, 2018

Like instruments used in afterschool programs to assess children's social and emotional growth or to evaluate staff members' performance, instruments used to evaluate program quality should be free from bias. Practitioners and researchers alike want to know that assessment instruments, whatever their type or intent, treat all people fairly and do…

Descriptors: Cultural Differences, Social Bias, Interrater Reliability, Program Evaluation

Response to "Rating Teachers Cheaper, Faster, and Better: Not so Fast": It's About Evidence

Peer reviewed

Gargani, John; Strong, Michael – Journal of Teacher Education, 2015

In Gargani and Strong (2014), we describe The Rapid Assessment of Teacher Effectiveness (RATE), a new teacher evaluation instrument. Our account of the validation research associated with RATE inspired a review by Good and Lavigne (2015). Here, we reply to the main points of their review. We elaborate on the validity, reliability, theoretical…

Descriptors: Evidence, Teacher Effectiveness, Teacher Evaluation, Evaluation Methods

Testing to the Top: Everything But the Kitchen Sink?

Dietel, Ron – Phi Delta Kappan, 2011

Two tests intended to measure student achievement of the Common Core State Standards will face intense scrutiny, but the test makers say they will include performance assessments and other items that are not multiple-choice questions. Incorporating performance items on these tests will bring up issues over scoring, costs, and validity.

Descriptors: Student Evaluation, State Standards, Test Construction, Intellectual Property

Marking as Judgment

Peer reviewed

Brooks, Val – Research Papers in Education, 2012

An aspect of assessment which has received little attention compared with perennial concerns, such as standards or reliability, is the role of judgment in marking. This paper explores marking as an act of judgment, paying particular attention to the nature of judgment and the processes involved. It brings together studies which have explored…

Descriptors: Educational Assessment, Test Reliability, Test Validity, Value Judgment

Standardising Assessment to Meet Student Needs in Foreign Language Modules in a University Context: Is Standardisation Possible?

Peer reviewed

Nunan, Anna – Language Learning in Higher Education, 2014

The Applied Language Centre at University College Dublin offers foreign language modules to students in ten languages at CEFR [Common European Framework of Reference for Languages] levels ranging from A1 to B2. Efforts have been underway in the Centre to standardise the assessment components across languages to ensure parity between module credits…

Descriptors: Second Language Learning, Second Language Instruction, College Students, Standards

Use of e-rater[R] in Scoring of the TOEFL iBT[R] Writing Test. Research Report. ETS RR-11-25

Haberman, Shelby J. – Educational Testing Service, 2011

Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…

Descriptors: Writing Tests, Scoring, Essays, Language Tests

Development of the Gross Motor Function Classification System (1997)

Peer reviewed

Morris, Christopher – Developmental Medicine & Child Neurology, 2008

To address the need for a standardized system to classify the gross motor function of children with cerebral palsy, the authors developed a five-level classification system analogous to the staging and grading systems used in medicine. Nominal group process and Delphi survey consensus methods were used to examine content validity and revise the…

Descriptors: Psychomotor Skills, Children, Test Construction, Content Validity

PISA 2006 Technical Report

OECD Publishing (NJ1), 2009

The Organisation for Economic Cooperation and Development's (OECD's) Programme for International Student Assessment (PISA) surveys, which take place every three years, have been designed to collect information about 15-year-old students in participating countries. PISA examines how well students are prepared to meet the challenges of the future,…

Descriptors: Policy Formation, Scaling, Academic Achievement, Interrater Reliability

The Asperger Syndrome (and High-Functioning Autism) Diagnostic Interview (ASDI): A Preliminary Study of a New Structured Clinical Interview.

Peer reviewed

Gillberg, Christopher; Gillberg, Carina; Rastam, Maria; Wentz, Elisabeth – Autism: The International Journal of Research and Practice, 2001

The development of the Asperger Syndrome (and high-functioning autism) Diagnostic Interview (ASDI) is described. Preliminary data from a clinical study of 20 individuals (ages 6-55) suggest that interrater reliability and test-retest stability may be excellent, with kappas exceeding 0.90 in both instances. The validity appears to be relatively…

Descriptors: Adults, Asperger Syndrome, Autism, Children

Suicide Attempt Self-Injury Interview (SASII): Development, Reliability, and Validity of a Scale to Assess Suicide Attempts and Intentional Self-Injury

Peer reviewed

Linehan, Marsha M.; Comtois, Katherine Anne; Brown, Milton Z.; Heard, Heidi L.; Wagner, Amy – Psychological Assessment, 2006

The authors describe the development of the Suicide Attempt Self-Injury Interview (SASII), an instrument designed to assess the factors involved in nonfatal suicide attempts and intentional self-injury. Using 4 cohorts of participants, authors generated SASII items and evaluated them with factor and content analyses and internal consistency…

Descriptors: Interrater Reliability, Suicide, Evaluation Methods, Self Destructive Behavior

Assessing the Writing of Deaf College Students: Reevaluating a Direct Assessment of Writing

Peer reviewed

Schley, Sara; Albertini, John – Journal of Deaf Studies and Deaf Education, 2005

The NTID Writing Test was developed to assess the writing ability of postsecondary deaf students entering the National Technical Institute for the Deaf and to determine their appropriate placement into developmental writing courses. While previous research (Albertini et al., 1986; Albertini et al., 1996; Bochner, Albertini, Samar, & Metz, 1992)…

Descriptors: Deafness, Writing Ability, Writing Tests, College Students

What's Involved in Collaborative R & D with a School District.

Humes, Ann – 1983

This paper, as an illustration of the procedures involved in a cooperative effort, describes a project in which the Southwest Regional Laboratory (SWRL) designed and developed a minimum standards test in collaboration with a large urban school district in California. The activity described focuses on the writing sample included in the test. The…

Descriptors: High Schools, Institutional Cooperation, Interrater Reliability, Minimum Competency Testing

Development of the Portuguese Speaking Test. Year One Project Report. Development of Semi-Direct Tests of Oral Proficiency in Hausa, Hebrew, Indonesian and Portuguese.

Stansfield, Charles W.; Kenyon, Dorry Mann – 1988

The development and validation of a Portuguese oral language test are described. The test consisted of five item types: personal conversation, giving directions, description of picture sequences, topical discourse, and oral task completion based on printed instructions. Three preliminary forms of the test were administered to a group of language…

Descriptors: Interrater Reliability, Interviews, Language Tests, Oral Language
