ERIC - Search Results

Showing 421 to 435 of 503 results

Who Will Watch the Watchers? Setting Standards for Classroom Observers.

Livingston, Samuel A.; Sims-Gunzenhauser, Alice – 1995

A study was conducted to provide information for setting two separate standards, the accuracy score and the documentation score, for the Praxis III: Classroom Performance Assessment (Praxis III). Praxis III is intended for making instructional and licensing decisions about beginning teachers. This standard-setting study was a person-judgment…

Descriptors: Beginning Teachers, Classroom Observation Techniques, Documentation, Elementary Secondary Education

Context Bias in the Test of English as a Foreign Language.

Angoff, William H. – 1989

This study was undertaken to test the hypothesis that items of the Test of English as a Foreign Language (TOEFL) containing reference to American people, places, customs, etc., tend to favor examinees who have spent some time living in the United States. Two samples of examinees were drawn from the March 1987 TOEFL administration, one tested in…

Descriptors: Context Effect, English (Second Language), Evaluators, Foreign Nationals

Traditional In-Baskets vs. the General Management In-Basket (GMIB).

Joines, Richard C. – 1991

The development and validation of the General Management In-Basket (GMIB) is described. The GMIB is a theory-based generic in-basket simulation, designed to assess supervisory and management skills independent of any job classification. Three of the 15 in-basket items in the GMIB are critical and are scored on a 0-5 scale. The remaining 12 items…

Descriptors: Administrator Evaluation, Concurrent Validity, Factor Analysis, Interrater Reliability

Estimation of Interrater and Parallel Forms Reliability for the MCAT Essay.

Mitchell, Karen J.; Anderson, Judith A. – 1987

The Association of American Medical Colleges is conducting research to develop, implement, and evaluate a Medical College Admission Test (MCAT) essay testing program. Essay administration in the spring and fall of 1985 and 1986 suggested that additional research was needed on the development of topics which elicit similar skills and meet standard…

Descriptors: College Entrance Examinations, Essay Tests, Estimation (Mathematics), Generalizability Theory

Essay Reliability: Form and Meaning.

Shale, Doug – 1986

This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…

Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests

The Development of a Baccalaureate Outcome Measure Based on a Generic Skills Theory of Human Performance.

Peterson, Gary W. – 1983

Even though several national testing firms have developed measures to evaluate the effectiveness of baccalaureate education, there continues to be a general reluctance on the part of faculty in colleges and universities to accept these measures as criteria on which to evaluate educational programs. Some of the resistance appears to lie in the lack…

Descriptors: Bachelors Degrees, Cognitive Processes, Difficulty Level, Essay Tests

The Gesell Screening Examination: Psychometric Properties.

Walker, Richard N. – 1989

In an assessment of the adequacy of the Gesell screening examination as a test instrument, a Gesell Screening Evaluation was given to 400 children semi-annually from their 4th to 6th year. The sample, which was stratified by parent occupation, included 40 girls and 40 boys at 5 age levels. The test battery corresponded with the Gesell Preschool…

Descriptors: Chronological Age, Early Childhood Education, Followup Studies, Interrater Reliability

The Influence of Same Day or Separate Day Observations on the Reliability of Assessment Data.

Yap, Kueh Chin; Capie, William – 1985

The purpose of this study was to compare the relative magnitude of the variance components and generalizability coefficients derived from the Teacher Performance Assessment Instruments (TPAI) data using two different methods of data collection: (1) occasions when observers were in the classroom for simultaneous observation and (2) occasions when…

Descriptors: Analysis of Variance, Classroom Observation Techniques, Data Collection, Elementary Secondary Education

Assessing Writing Skill. Research Monograph No. 11.

Breland, Hunter M.; And Others – 1987

Six university English departments collaborated in this examination of the differences between multiple-choice and essay tests in evaluating writing skills. The study also investigated ways the two tools can complement one another, ways to improve cost effectiveness of essay testing, and ways to integrate assessment and the educational process.…

Descriptors: Comparative Testing, Efficiency, Essay Tests, Higher Education

Interrater Reliability and Internal Consistency of Student and Staff Ratings of Medical Instruction.

Dielman, T. E.; Horvatich, Paula K. – 1985

The purposes of this study were to establish the interrater reliability, dimensionality, and internal consistency of an instruction evaluation instrument used at The University of Michigan Medical School. Using the nine-item rating scale, 1,758 student ratings and 88 staff ratings were gathered on 61 faculty. Interrater agreement ranged from .28…

Descriptors: Evaluation Methods, Graduate Medical Education, Higher Education, Interrater Reliability

Assessing Inconsistencies in Standard Setting with the Angoff or Nedelsky Technique.

van der Linden, Wim J. – 1982

A latent trait method is presented to investigate the possibility that Angoff or Nedelsky judges specify inconsistent probabilities in standard setting techniques for objectives-based instructional programs. It is suggested that judges frequently specify a low probability of success for an easy item but a large probability for a hard item. The…

Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Interrater Reliability

Reliability of the AAMD Adaptive Behavior Scale-Public School Version.

Peer reviewed

Mayfield, Kathy L.; And Others – Journal of School Psychology, 1984

Investigated interrater reliability of the AAMD Adaptive Behavior Scale-Public School Version in a sample of 31 educable mentally handicapped children who were rated by their parents, special education teacher, classroom teacher, and an independent observer. Results showed ratings of the special education teacher were generally lower. (JAC)

Descriptors: Adjustment (to Environment), Behavior Rating Scales, Children, Elementary Education

Performance Assessments of Creativity: Do They Have Long-Term Stability?

Peer reviewed

Baer, John – Roeper Review, 1994

Two studies are reported that measure the long-term stability of performance assessments involving story-writing and poetry-writing (involving grade four and five students) and story-telling (involving grade two students). The long-term stability of these assessments compares favorably with stability figures for other creativity tests. (Author/JDD)

Descriptors: Creative Thinking, Creativity, Creativity Tests, Elementary Education

A Comparison and Evaluation of Three Commonly Used Autism Scales.

Peer reviewed

Sevin, Jay A.; And Others – Journal of Autism and Developmental Disorders, 1991

This study, involving 24 children or adolescents with pervasive developmental disorders, assessed 3 autism scales: Autism Behavior Checklist, Real Life Rating Scale, and Childhood Autism Rating Scale. The study analyzed interrater reliability, correlations between pairs of the three scales, diagnostic classification cutoff scores, and…

Descriptors: Adaptive Behavior (of Disabled), Behavior Rating Scales, Check Lists, Educational Diagnosis

The Assessment of Preterm Infants' Behavior (APIB): Furthering the Understanding and Measurement of Neurodevelopmental Competence in Preterm and Full-Term Infants

Peer reviewed

Als, Heidelise; Butler, Samantha; Kosta, Sandra; McAnulty, Gloria – Mental Retardation and Developmental Disabilities Research Reviews, 2005

The Assessment of Preterm Infants' Behavior (APIB) is a newborn neurobehavioral assessment appropriate for preterm, at risk, and full-term newborns, from birth to 1 month after expected due date. The APIB is based in ethological--evolutionary thought and focuses on the assessment of mutually interacting behavioral subsystems in simultaneous…

Descriptors: Premature Infants, Neonates, Infant Behavior, Measurement Techniques
