Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 11 |
| Since 2017 (last 10 years) | 45 |
| Since 2007 (last 20 years) | 123 |
Author
| Alonzo, Julie | 26 |
| Tindal, Gerald | 24 |
| Lai, Cheng-Fei | 16 |
| Anderson, Daniel | 14 |
| Park, Bitnara Jasmine | 13 |
| Irvin, P. Shawn | 8 |
| Nese, Joseph F. T. | 7 |
| Petscher, Yaacov | 5 |
| Gill, Brian | 4 |
| Saez, Leilani | 4 |
| Benton, Stephen L. | 3 |
Publication Type
| Numerical/Quantitative Data | 252 |
| Reports - Research | 127 |
| Reports - Evaluative | 68 |
| Reports - Descriptive | 41 |
| Tests/Questionnaires | 27 |
| Speeches/Meeting Papers | 22 |
| Journal Articles | 17 |
| Guides - Non-Classroom | 10 |
| Collected Works - General | 4 |
| Guides - General | 3 |
| Books | 1 |
Location
| Florida | 10 |
| New York | 8 |
| Illinois | 7 |
| Nebraska | 7 |
| United States | 7 |
| California | 6 |
| Maryland | 5 |
| Massachusetts | 5 |
| Pennsylvania | 5 |
| North Carolina | 4 |
| Texas | 4 |
Laws, Policies, & Programs
| American Recovery and… | 6 |
| Race to the Top | 6 |
| Individuals with Disabilities… | 1 |
| No Child Left Behind Act 2001 | 1 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Boldt, R. F. – 1992
The Test of Spoken English (TSE) is an internationally administered instrument for assessing nonnative speakers' proficiency in speaking English. The research foundation of the TSE examination described in its manual refers to two sources of variation other than the achievement being measured: interrater reliability and internal consistency.…
Descriptors: Adults, Analysis of Variance, Interrater Reliability, Language Proficiency
Ezzelle, Carol; Setzer, J. Carl – GED Testing Service, 2009
This manual was written to provide technical information regarding the 2002 Series GED (General Educational Development) Tests. Throughout this manual, documentation is provided regarding the development of the GED Tests, data collection activities, as well as reliability and validity evidence. The purpose of this manual is to provide evidence…
Descriptors: High School Equivalency Programs, Testing Programs, Test Validity, Test Reliability
Jamgochian, Elisa; Park, Bitnara Jasmine; Nese, Joseph F. T.; Lai, Cheng-Fei; Saez, Leilani; Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2010
In this technical report, we provide reliability and validity evidence for the easyCBM[R] Reading measures for grade 2 (word and passage reading fluency and multiple choice reading comprehension). Evidence for reliability includes internal consistency and item invariance. Evidence for validity includes concurrent, predictive, and construct…
Descriptors: Grade 2, Reading Comprehension, Testing Programs, Reading Fluency
Tindal, Gerald; Lee, Daesik; Geller, Leanne Ketterlin – Behavioral Research and Teaching, 2008
In this paper we review different methods for teachers to recommend accommodations in large scale tests. Then we present data on the stability of their judgments on variables relevant to this decision-making process. The outcomes from the judgments support the need for a more explicit model. Four general categories are presented: student…
Descriptors: Teachers, Reliability, Decision Making, Testing Accommodations
Livingston, Samuel A.; Wingersky, Marilyn A. – Journal of Educational Measurement, 1979 (peer reviewed)
Procedures are described for studying the reliability of decisions based on specific passing scores with tests made up of discrete items and designed to measure continuous rather than categorical traits. These procedures are based on the estimation of the joint distribution of true scores and observed scores. (CTM)
Descriptors: Cutting Scores, Decision Making, Efficiency, Error of Measurement
Allen, Jeff; Bassiri, Dina; Noble, Julie – ACT, Inc., 2009
Educational accountability has grown substantially over the last decade, due in large part to the No Child Left Behind Act of 2001. Accordingly, educational researchers and policymakers are interested in the statistical properties of accountability models used for NCLB, such as status, improvement, and growth models; as well as others that are not…
Descriptors: Academic Achievement, High School Students, Accountability, Statistical Analysis
Lester, David – Omega: Journal of Death and Dying, 1991 (peer reviewed)
Published Lester Attitude toward Death Scale for first time, together with data on its reliability and validity. Notes that scale is different from other fear of death scales in its use of scaled value approach that permits measure of inconsistency in attitudes. (Author)
Descriptors: Attitude Measures, Death, Test Reliability, Test Validity
Sartain, Lauren; Stoelinga, Sara Ray; Brown, Eric R. – Consortium on Chicago School Research, 2011
This report summarizes findings from a two-year study of Chicago's Excellence in Teaching Pilot, which was designed to drive instructional improvement by providing teachers with evidence-based feedback on their strengths and weaknesses. The pilot consisted of training and support for principals and teachers, principal observations of teaching…
Descriptors: Evidence, Feedback (Response), Public Schools, Teacher Effectiveness
Angoff, William H. – 1989
This study was undertaken to test the hypothesis that items of the Test of English as a Foreign Language (TOEFL) containing reference to American people, places, customs, etc., tend to favor examinees who have spent some time living in the United States. Two samples of examinees were drawn from the March 1987 TOEFL administration, one tested in…
Descriptors: Context Effect, English (Second Language), Evaluators, Foreign Nationals
New Mexico Public Education Department, 2007
The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of and technical characteristics of the 2007 NMSBA. The 2007 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Summary of student performance; (4) Statistical analyses of item and…
Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring
Wood, Terry M.; Safrit, Margaret J. – Research Quarterly for Exercise and Sport, 1984 (peer reviewed)
A proposed model for estimating psychomotor test battery reliability, based upon canonical correlation analysis, is described. (Author/JMK)
Descriptors: Evaluation Criteria, Multivariate Analysis, Physical Education, Psychomotor Skills
Hoffman, R. Gene; Wise, Lauress L. – 2000
Classical test theory is based on the concept of a true score for each examinee, defined as the expected or average score across an infinite number of repeated parallel tests. In most cases, there is only a score from a single administration of the test in question. The difference between this single observed score and the underlying true score is…
Descriptors: Achievement, Classification, Observation, Probability
Yiu, Edwin M.-L.; Ng, Chi-Yan – Clinical Linguistics and Phonetics, 2004
One of the factors that affects the reliability of perceptual voice evaluation is the rating scale. Equal-appearing interval (EAI) and visual analogue (VA) scales are the two most common scales used and have attracted much attention in recent studies of perceptual voice evaluation. Available findings are contradictory, with one study finding the…
Descriptors: Test Reliability, Measurement Techniques, Rating Scales, Phonetics
Lee, Guemin; Frisbie, David A. – 1997
Previous studies have indicated that the reliability of test scores composed of testlets might be overestimated by conventional item-based reliability estimation methods (R. Thorndike, 1953; A. Anastasi, 1988; S. Sireci, D. Thissen, and H. Wainer, 1991; H. Wainer and D. Thissen, 1996). This study used generalizability theory to investigate the…
Descriptors: Estimation (Mathematics), Generalizability Theory, Reliability, Scores
Subkoviak, Michael J. – 1985
Current methods of obtaining reliability coefficients for mastery tests are laborious from a practitioner's perspective. Some methods require two test administrations; while others require access to computer facilities and/or advanced measurement and statistical procedures. This report provides tables from which practitioners can read such…
Descriptors: Estimation (Mathematics), Mastery Tests, Statistical Studies, Tables (Data)

