Peer reviewed: Huynh, Huynh – Journal of Educational Measurement, 1976
Within the beta-binomial Bayesian framework, procedures are described for the evaluation of the kappa index of reliability on the basis of one administration of a domain-referenced test. Major factors affecting this index include cutoff score, test score variability and test length. Empirical data which substantiate some theoretical trends deduced…
Descriptors: Criterion Referenced Tests, Decision Making, Mathematical Models, Probability
Peer reviewed: Subkoviak, Michael J. – Journal of Educational Measurement, 1976
A number of different reliability coefficients have recently been proposed for tests used to differentiate between groups such as masters and nonmasters. One promising index is the proportion of students in a class that are consistently assigned to the same mastery group across two testings. The present paper proposes a single test administration…
Descriptors: Criterion Referenced Tests, Mastery Tests, Mathematical Models, Probability
Peer reviewed: Lovett, Hubert T. – Educational and Psychological Measurement, 1977
The analysis of variance model for estimating reliability in norm referenced tests is extended to criterion referenced tests. The essential modification is that the criterion or cut-off score is substituted for the population mean. An example and discussion are presented. (JKS)
Descriptors: Analysis of Variance, Criterion Referenced Tests, Cutting Scores, Test Reliability
Peer reviewed: Williams, Thomas O., Jr.; Eaves, Ronald C. – Psychology in the Schools, 2002
Examines reliability of test scores for the Pervasive Developmental Disorders Rating Scale (PDDRS), a screening instrument used in the assessment of autistic disorder. The results indicated that coefficient alpha for the PDDRS Total Score was adequate for screening purposes for both age groups studied. The results of the test-retest study also…
Descriptors: Autism, Pervasive Developmental Disorders, School Psychology, Screening Tests
Peer reviewed: Patrick, Donald L.; Edwards, Todd C.; Topolski, Tari D. – Journal of Adolescence, 2002
Presents the psychometric properties of the Youth Quality of Life Instrument-Research Version (YQOL-R) conceptual module. Item and factor analyses confirmed the hypothesized conceptual model derived from previous qualitative research. The scales of the YQOL-R showed acceptable internal consistency, reproducibility, expected associations with other…
Descriptors: Adolescents, Models, Psychometrics, Quality of Life
Peer reviewed: Rogers, James R.; Lewis, Mary Miller; Subich, Linda Mezydlo – Journal of Counseling & Development, 2002
Investigates validity of the Suicide Assessment Checklist (SAC) in a sample of 1,969 admissions to a psychiatric emergency crisis center. Supporting construct-related validity, total score differences were found in the expected directions as a function of referral reason. Convergent validity was based on observed correlations between selected SAC…
Descriptors: Counseling, Evaluation Methods, Psychiatric Services, Suicide
Peer reviewed: Villaume, William A.; Brown, Mary Helen – International Journal of Listening, 1999
Notes that presbycusis, hearing loss associated with aging, may be marked by a second dimension of hearing loss, a loss in vocalic sensitivity. Reports on the development of the Vocalic Sensitivity Test, which controls for the verbal elements in speech while also allowing for the vocalics to exercise their normal metacommunicative function of…
Descriptors: Hearing Impairments, Higher Education, Listening Comprehension, Older Adults
Peer reviewed: Lewandowski, Lawrence J.; Martens, Brian K. – Journal of Reading, 1990
Provides an approach for selecting and evaluating both group and individually administered standardized tests of reading. Reviews considerations of the quality of test development; test content; test reliability and validity; and concerns of cost and time investment. Presents sample ratings of two common instruments. (RS)
Descriptors: Reading Tests, Secondary Education, Standardized Tests, Test Reliability
Peer reviewed: Ellers, Robert A.; And Others – Journal of School Psychology, 1989
Examined test-retest stability of Behavior Rating Profile for students grades 1-12 (N=198), parents (N=212), and teachers (N=176) on 3 norm-referenced scales. Found Teacher Rating scale reliable across all grades for screening and eligibility, Parent Rating scale reliable for Grade 3-12 screening and Grade 3-6, 11, and 12 eligibility. Found…
Descriptors: Behavior Rating Scales, Elementary Secondary Education, Special Education, Test Reliability
Peer reviewed: Humphreys, Lloyd G.; Drasgow, Fritz – Applied Psychological Measurement, 1989
Issues arising from difference scores with zero reliability that nevertheless allow a powerful test of change are discussed. Issues include the appropriateness of underlying statistical models for psychological data and the relationship between difference scores and power. Increases in reliability always increase power for a fixed effect size.…
Descriptors: Goodness of Fit, Mathematical Models, Power (Statistics), Psychometrics
Peer reviewed: Glutting, Joseph J. – Journal of School Psychology, 1989
Introduces Stanford-Binet Intelligence Scale-Fourth Edition (SB4) as an attempt to revitalize Stanford-Binet by maintaining links with previous editions while simultaneously incorporating more recent developments found in other popular tests of intelligence. Discusses the SB4's theoretical foundation, materials and administration, scaling,…
Descriptors: Intelligence Tests, Models, Test Reliability, Test Use
Peer reviewed: Diamond, Adele; And Others – Developmental Psychology, 1994
Found that faulty test procedures may explain why infants sometimes locate hidden objects more easily in multiple-well tests than in two-well trials. Also found that errors in seven-well tests were not evenly distributed but occurred disproportionately in the direction of the previously correct well, suggesting that memory and inhibition are both…
Descriptors: Infants, Inhibition, Memory, Recall (Psychology)
Peer reviewed: Kennamer, J. David – Journalism Quarterly, 1992
Investigates the use of "vague quantifiers" (terms such as "often," "sometimes," "rarely," or "never") in communication research. Finds that these words do not always mean the same thing to different people, and thus may not constitute interval scales. Suggests that research outcomes based upon such…
Descriptors: Communication Research, Higher Education, Research Methodology, Research Problems
Peer reviewed: Hansen, Jo-Ida C.; And Others – Journal of Vocational Behavior, 1993
Multidimensional scaling was applied to Women-in-General (n=300) and Men-in-General (n=300) samples of the Strong Interest Inventory. Participants were matched on occupational title, obtaining two-dimensional solutions that demonstrated gender differences in the underlying structure of vocational interests. (SK)
Descriptors: Interest Inventories, Multidimensional Scaling, Sex Differences, Test Reliability
Elliott, Diana M.; Briere, John – Child Abuse and Neglect: The International Journal, 1992
In a national survey of 2,963 professional women, the Trauma Symptom Checklist was found to be reliable and to display predictive validity in regard to childhood sexual victimization. Women who reported a sexual abuse history scored significantly higher than did women with no such history on each of the six subscales. (Author/DB)
Descriptors: Adults, Child Abuse, Females, Sexual Abuse


