Publication Date
| In 2026 | 3 |
| Since 2025 | 666 |
| Since 2022 (last 5 years) | 3167 |
| Since 2017 (last 10 years) | 7408 |
| Since 2007 (last 20 years) | 15046 |
Descriptor
| Test Reliability | 15036 |
| Test Validity | 10272 |
| Reliability | 9759 |
| Foreign Countries | 7141 |
| Test Construction | 4823 |
| Validity | 4191 |
| Measures (Individuals) | 3877 |
| Factor Analysis | 3825 |
| Psychometrics | 3525 |
| Interrater Reliability | 3124 |
| Correlation | 3039 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1327 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 252 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Peer reviewed: Buboltz, Walter C., Jr.; Thomas, Adrian; Donnell, Alison J. – Journal of Counseling & Development, 2002
Psychological reactance is an important construct for social scientists. The measure most often used to tap psychological reactance is the Therapeutic Reactance Scale (TRS). However, little research to date has examined the psychometric properties of the TRS. Eight hundred and eighty-three individuals completed the TRS and their responses were…
Descriptors: Factor Structure, Psychological Testing, Psychometrics, Test Reliability
Peer reviewed: Sanderson, Patricia – Educational Research, 2000
Factor analysis of an initial sample of 368 11-16 year-olds and a test with 1,668 confirmed the reliability and validity of a dance attitude instrument. Two subscales, ballet and male dancers, produced valid measurements of attitudes, but dance performance and presentation scales were less reliable. (SK)
Descriptors: Adolescents, Attitudes, Dance, Foreign Countries
Peer reviewed: Lindell, Michael K.; Brandt, Christina J.; Whitney, David J. – Applied Psychological Measurement, 1999
Proposes a revised index of interrater agreement for multi-item ratings of a single target. This index is an inverse linear function of the ratio of the average obtained variance to the variance of the uniformly distributed random error. Discusses the importance of sample size for the index. (SLD)
Descriptors: Error of Measurement, Interrater Reliability, Sample Size
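The index described in this abstract can be illustrated with a minimal sketch. It assumes the form 1 minus the ratio of the mean item variance to the variance of a discrete uniform distribution over A response options, (A² − 1)/12. The function names, the use of the sample variance (n − 1 denominator), and the example ratings are assumptions made for illustration and are not taken from the article.

```python
from statistics import variance


def uniform_error_variance(num_options: int) -> float:
    """Variance of a discrete uniform distribution over num_options scale points."""
    return (num_options ** 2 - 1) / 12.0


def agreement_index(ratings, num_options):
    """Agreement among raters on multiple items about a single target.

    ratings: one list of item scores per rater (hypothetical layout).
    Returns 1 - (mean item variance) / (uniform error variance).
    """
    # Transpose so that each tuple holds all raters' scores for one item.
    items = list(zip(*ratings))
    # Assumption: sample variance (n - 1 denominator) across raters.
    mean_item_variance = sum(variance(item) for item in items) / len(items)
    return 1.0 - mean_item_variance / uniform_error_variance(num_options)


# Three raters, two items, 5-point scale (made-up data).
identical = [[3, 4], [3, 4], [3, 4]]
divergent = [[1, 2], [3, 4], [5, 3]]
print(agreement_index(identical, 5))
print(agreement_index(divergent, 5))
```

Because the function is linear in the observed variance, values fall below zero when raters disagree more than uniform random responding would predict, as in the second example.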
Peer reviewed: Musante, Linda; Treiber, Frank A.; Davis, Harry C.; Thompson, William O.; Waller, Jennifer L. – Assessment, 1999
Findings related to internal consistency, temporal stability, and principal components structures suggest that the Anger Expression Scale (C. Spielberger and others, 1985) and the Pediatric Anger Expression Scale (G. Jacobs and others, 1989), studied with a sample of 415 youth with a mean age of 14.7 years, are acceptably reliable. (SLD)
Descriptors: Adolescents, Anger, Factor Structure, Reliability
Peer reviewed: Komaroff, Eugene – Applied Psychological Measurement, 1997
Used simulation to evaluate coefficient alpha under violations of two classical test theory assumptions: essential tau-equivalence and uncorrelated errors. Discusses the interactive effects of both violations with true and error scores. Provides empirical evidence of the derivation of M. Novick and C. Lewis (1993). (SLD)
Descriptors: Correlation, Reliability, Simulation, Test Theory
Peer reviewed: Lehtokangas, Raija; Jarvelin, Kalervo – Journal of Documentation, 2001
Investigates the consistency of different newspapers in their choice of words when writing about the same news events based on a study of three Finnish newspapers. Concludes that expression inconsistency is a sign of a retrieval problem and that query expansion based on semantic relationships can significantly improve retrieval performance on free…
Descriptors: Foreign Countries, Information Retrieval, Newspapers, Reliability
Peer reviewed: Vacha-Haase, Tammi; Kogan, Lori R.; Tani, Crystal R.; Woodall, Renee A. – Educational and Psychological Measurement, 2001
Used reliability generalization to explore the variance of scores on 10 Minnesota Multiphasic Personality Inventory (MMPI) clinical scales drawing on 1,972 articles in the literature on the MMPI. Results highlight the premise that scores, not tests, are reliable or unreliable, and they show that study characteristics do influence scores on the…
Descriptors: Clinical Diagnosis, Diagnostic Tests, Generalization, Reliability
Schuster, Christof; Smith, David A. – Psychometrika, 2005
The rater agreement literature is complicated by the fact that it must accommodate at least two different properties of rating data: the number of raters (two versus more than two) and the rating scale level (nominal versus metric). While kappa statistics are most widely used for nominal scales, intraclass correlation coefficients have been…
Descriptors: Psychometrics, Statistics, Rating Scales, Correlation
Hall, Kendra M.; Markham, Janet C.; Culatta, Barbara – Communication Disorders Quarterly, 2005
In the present study, the authors investigated the initial development of the Early Expository Comprehension Assessment (EECA) by examining its reliability. The EECA consists of a compare/contrast passage, manipulatives to represent the information in the paragraph, and three response tasks ("Retelling, Mapping, and Comparing"). The authors…
Descriptors: Statistical Analysis, Computation, Preschool Children, Test Reliability
Gagne, Phill; Hancock, Gregory R. – Multivariate Behavioral Research, 2006
Sample size recommendations in confirmatory factor analysis (CFA) have recently shifted away from observations per variable or per parameter toward consideration of model quality. Extending research by Marsh, Hau, Balla, and Grayson (1998), simulations were conducted to determine the extent to which CFA model convergence and parameter estimation…
Descriptors: Sample Size, Factor Analysis, Computation, Models
Mellor, David – Psychological Assessment, 2004
A sample of 917 children, aged 7 to 17 years, their parents, and their teachers each completed the appropriate version of the Strengths and Difficulties Questionnaire (SDQ), and 120 from each group did so again 2 weeks later. The results indicate that the SDQ demonstrates sound interinformant and test-retest reliability. Younger children, whose…
Descriptors: Measures (Individuals), Adolescents, Questionnaires, Test Reliability
Baird, Jo-Anne; Greatorex, Jackie; Bell, John F. – Assessment in Education Principles Policy and Practice, 2004
Marking reliability is purported to be produced by having an effective community of practice. No experimental research has been identified which attempts to verify empirically the aspects of a community of practice that have been observed to produce marking reliability. This research outlines what that community of practice might entail and…
Descriptors: Foreign Countries, Grades (Scholastic), Grading, Interrater Reliability
Kane, Michael – Measurement: Interdisciplinary Research and Perspectives, 2004
The commentaries include a wealth of insightful and interesting observations and suggestions, and I appreciate each author taking the time to comment on my efforts. In responding to their suggestions, I am inclined to develop a few general points raised in the commentaries a bit further.
Descriptors: Test Validity, Test Reliability, Methods, Statistical Inference
Munson, Benjamin; Brinkman, Kayla N. – American Journal of Speech-Language Pathology, 2004
Two experiments examined whether listening to multiple presentations of recorded speech stimuli influences the reliability and accuracy of judgments of children's speech production accuracy. In Experiment 1, 10 listeners phonetically transcribed words produced by children with phonological impairments after a single presentation and after the word…
Descriptors: Speech, Children, Phonetics, Speech Impairments
Roberts, Felicia; Robinson, Jeffrey D. – Human Communication Research, 2004
This investigation assesses interobserver agreement on conversation analytic (CA) transcription. Four professional CA transcribers spent a maximum of 3 hours transcribing 2.5 minutes of a previously unknown, naturally occurring, mundane telephone call. Researchers unitized transcripts into words, sounds, silences, inbreaths, outbreaths, and laugh…
Descriptors: Interrater Reliability, Discourse Analysis, Semantics, Pragmatics