Publication Date
| In 2026 | 2 |
| Since 2025 | 441 |
| Since 2022 (last 5 years) | 1920 |
| Since 2017 (last 10 years) | 4492 |
| Since 2007 (last 20 years) | 6977 |
Audience
| Researchers | 454 |
| Practitioners | 319 |
| Teachers | 128 |
| Administrators | 73 |
| Policymakers | 33 |
| Counselors | 31 |
| Students | 17 |
| Parents | 10 |
| Community | 6 |
| Support Staff | 5 |
Location
| Turkey | 831 |
| Australia | 239 |
| China | 211 |
| Canada | 207 |
| Indonesia | 161 |
| Spain | 129 |
| United States | 123 |
| United Kingdom | 121 |
| Germany | 111 |
| Taiwan | 108 |
| Netherlands | 102 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 2 |
| Meets WWC Standards with or without Reservations | 2 |
| Does not meet standards | 1 |
Peer reviewed. Anglin, M. Douglas; And Others – Evaluation Review, 1993
Reliability and validity of self-reported behavior within a deviant population are examined using data from 2 interviews with 323 narcotics addicts conducted 10 years apart (1974-75 and 1985-86). Results complement existing reliability and validity studies of alcohol use, and suggest that quality information can be obtained from heroin users. (SLD)
Descriptors: Comparative Testing, Drinking, Drug Addiction, Evaluation Methods
Peer reviewed. Rothman, A. I.; And Others – Academic Medicine, 1991
A 1990 study of domain-referenced scores from a multiple-station clinical examination for foreign medical graduates investigated identification of essential checklist items, setting of minimum passing scores, consistency of candidate classification, and perceived appropriateness of the number of candidates classified as competent. Results and…
Descriptors: Foreign Medical Graduates, Higher Education, Medical Education, Medical Evaluation
Peer reviewed. Henry, Rachael M. – Educational and Psychological Measurement, 1991
Logical difficulties with existing measures of construct implications are examined, and a new instrument that partially overcomes them--the Logical Relations Grid--is described. Empirical data from a study of 28 children and 47 parents in Australia are given in support of instrument reliability and validity. (SLD)
Descriptors: Cognitive Processes, Construct Validity, Elementary School Students, Foreign Countries
Cizek, Gregory J. – Phi Delta Kappan, 1991
This rejoinder to Grant Wiggins on performance assessment suggests that true educational reform will undoubtedly be evidenced by something more substantial than pocket folders bulging with student work. Labeling performance tests "authentic" does not ensure their validity, reliability, or incorruptibility. Such tests are neither replacements nor…
Descriptors: Elementary Secondary Education, Multiple Choice Tests, Performance Based Assessment, Pilot Projects
Peer reviewed. Cohen, Robert; And Others – Academic Medicine, 1991
The performance of foreign medical school graduates on multistation standardized patient-based tests was used to determine the validity and generalizability of global ratings of their clinical competence made by expert examiners. Results suggest that these ratings can be used as an effective form of assessment in this context. (Author/MSE)
Descriptors: Foreign Medical Graduates, Higher Education, Holistic Approach, Medical Education
Peer reviewed. Feil, Edward G.; Becker, Wesley C. – Behavioral Disorders, 1993
The Walker/Severson Systematic Screening for Behavior Disorders measure was revised for use with preschool children. The revision consists of three hierarchical stages of increasingly time-consuming methodologies: (1) teacher rankings, (2) teacher ratings, and (3) direct behavioral observations. Testing with 121 children demonstrated significant…
Descriptors: Behavior Disorders, Behavior Rating Scales, Preschool Children, Preschool Education
Putnam, Frank W.; And Others – Child Abuse and Neglect: The International Journal, 1993
Evaluation of the Child Dissociative Checklist found it to be a reliable and valid observer report measure of dissociation in children, including sexually abused girls and children with dissociative disorder and with multiple personality disorder. The checklist, which is appended, is intended as a clinical screening instrument and research measure…
Descriptors: Check Lists, Children, Emotional Disturbances, Psychological Evaluation
Peer reviewed. Gellman, Estelle S. – Action in Teacher Education, 1993
Portfolio assessment can be a valuable tool in assessing professional proficiency in teachers if appropriate attention is given to issues of reliability and validity. The Teaching Assessment Project at Stanford University has explored portfolios as an alternative to traditional methods of teacher evaluation. (IAH)
Descriptors: Elementary Secondary Education, Portfolios (Background Materials), Teacher Competencies, Teacher Competency Testing
Peer reviewed. Armstrong, Ronald D.; And Others – Journal of Educational Statistics, 1994
A network-flow model is formulated for constructing parallel tests based on classical test theory while using test reliability as the criterion. Practitioners can specify a test-difficulty distribution for values of item difficulties as well as test-composition requirements. An empirical study illustrates the reliability of generated tests. (SLD)
Descriptors: Algorithms, Computer Assisted Testing, Difficulty Level, Item Banks
Peer reviewed. Zimmerman, Donald W.; And Others – Applied Psychological Measurement, 1993
Some of the methods originally used to find relationships between reliability and power associated with a single measurement are extended to difference scores. Results, based on explicit power calculations, show that augmenting the reliability of measurement by reducing error score variance can make significance tests of difference more powerful.…
Descriptors: Equations (Mathematics), Error of Measurement, Individual Differences, Mathematical Models
Peer reviewed. Humphreys, Lloyd G.; And Others – Applied Psychological Measurement, 1993
Two articles discuss the controversy about the relationship between reliability and the power of significance tests in response to the discussion of Donald W. Zimmerman, Richard H. Williams, and Bruno D. Zumbo. Lloyd G. Humphreys emphasizes the differences between what statisticians can do and constraints on researchers. Zimmerman, Williams, and…
Descriptors: Error of Measurement, Individual Differences, Power (Statistics), Research Methodology
Peer reviewed. Roznowski, Mary; Smith, Marna L. – Intelligence, 1993
The measurement and psychometric quality of the Sternberg task (S. Sternberg, 1966, 1969), a memory search task, was investigated with 78 undergraduates. Individual performance was fairly homogeneous across responses, fairly unstable over time, and fairly stable across stimulus content. Implications for individual differences research are discussed.…
Descriptors: Cognitive Tests, Evaluation Methods, Higher Education, Individual Differences
Peer reviewed. Matson, Johnny L.; Smiroldo, Brandi B. – Research in Developmental Disabilities, 1997
A study tested the validity of the Diagnostic Assessment for the Severely Handicapped-II (DASH-II) for determining the presence of mania (bipolar disorder) in 22 individuals with severe mental retardation. Results showed that the mania subscale was internally consistent and could be used to classify manic and control subjects accurately. (Author/CR)
Descriptors: Adults, Clinical Diagnosis, Disability Identification, Evaluation Methods
Peer reviewed. Dozois, David J. A.; Ahnberg, Jamie L.; Dobson, Keith S. – Psychological Assessment, 1998
Provides psychometric information on the second edition of the Beck Depression Inventory (BDI-II) (A. Beck, R. Steer, and G. Brown, 1996) for internal consistency, factorial validity, and gender differences. Results indicate that the BDI-II is a stronger instrument than its predecessor in terms of factor structure. (SLD)
Descriptors: Depression (Psychology), Factor Analysis, Factor Structure, Psychometrics
Peer reviewed. Scarsellone, Jana M. – Journal of Speech, Language, and Hearing Research, 1998
Hearing in Noise Test (HINT) list equivalency was examined using 24 listeners (ages 60 to 70) with sensorineural hearing impairments. Four speech conditions were tested, including a quiet condition and three noise conditions. Results showed that for the three noise conditions, all lists were within 2 dB of the means, indicating list equivalency.…
Descriptors: Auditory Evaluation, Auditory Perception, Communication Research, Generalization


