Vu, Nu Viet; And Others – Academic Medicine, 1992
The use of a performance-based assessment of senior medical students' clinical skills utilizing standardized patients was evaluated, with 6,804 student-patient encounters involving 405 students over 6 years. Results provide evidence for test security, content validity, construct validity, reliability, and test ability to discriminate a wide range…
Descriptors: Clinical Experience, Evaluation Methods, Higher Education, Medical Education

Dornyei, Zoltan; Katona, Lucy – Language Testing, 1992
A total of 102 university English majors were administered 4 different language tests to form a General Language Proficiency measure against which the C-test was evaluated. Results confirmed its reliability and validity and also provided data on text difficulty/appropriateness, word structure, content, and different scoring methods. (13…
Descriptors: College Students, English (Second Language), Higher Education, Language Proficiency

Mittenberg, Wiley; And Others – Psychological Assessment, 1992
Normative data for the Wechsler Memory Scale-Revised were derived empirically using a sample of 50 volunteers between 25 and 34 years of age, who matched U.S. Census data on demographic characteristics. Differences between these empirical norms and published norms that were estimated statistically appear clinically significant. (SLD)
Descriptors: Adults, Census Figures, Demography, Diagnostic Tests

Bullis, Michael; Reiman, John – Exceptional Children, 1992
The Transition Competence Battery for Deaf Adolescents and Young Adults (TCB) measures employment and independent living skills. The TCB was standardized on students (N from 180 to 230 for the different subtests) from both mainstreamed and residential settings. Item statistics and subtest reliabilities were adequate; evidence of construct validity…
Descriptors: Adolescents, Competence, Deafness, Education Work Relationship

Cohen, Robert; And Others – Academic Medicine, 1991
A study evaluated the feasibility of an objective structured clinical examination to assess the competence of foreign medical school graduates, clinical clerks, and interns to address clinical ethical situations. The University of Toronto's experience with the measure found it useful but in need of improvement. (MSE)
Descriptors: Clinical Experience, Ethics, Evaluation Criteria, Evaluation Methods

Strauman, Timothy J.; Wetzler, Scott – Multivariate Behavioral Research, 1992
Scale-level factor analyses are reported for 2 self-report measures of psychopathology, the Symptom Checklist-90-R (SCL-90) and the Millon Clinical Multiaxial Inventory (MCMI), using 130 psychiatric inpatients and outpatients. Used separately, the measures offer limited interpretability of scale profiles. Their combined use permits differentiation…
Descriptors: Adults, Anxiety, Clinical Diagnosis, Comparative Testing

Miller, Linda T.; Vernon, Philip A. – Intelligence, 1992
The general intelligence factor (g) was investigated using 170 university students across three batteries of ability measures: (1) a short-term memory battery; (2) the Multidimensional Aptitude Battery; and (3) a reaction time battery. Results support the notion of g and suggest short-term memory as an essential aspect of intelligence. (SLD)
Descriptors: Ability, Aptitude Tests, Cognitive Processes, College Students

Nickell, Pat – Social Education, 1992
Presents an interview with Grant Wiggins, executive director of an educational consulting group. Discusses performance assessment and authenticity in testing. Addresses topics such as the student as worker, diploma as exhibition of mastery, authenticity of assessment, fairness, and skill as opposed to knowledge. Urges examining desired social…
Descriptors: Academic Standards, Educational Objectives, Experiential Learning, Mastery Tests

Kon, Jane Heckley; Martin-Kniep, Giselle O. – Social Education, 1992
Describes a case study to determine whether performance tests are a feasible alternative to multiple-choice tests. Examines the difficulties of administering and scoring performance assessments. Explains that the study employed three performance tests and one multiple-choice test. Concludes that performance test administration and scoring was no…
Descriptors: Educational Objectives, Educational Research, Educational Testing, Geography Instruction

Shepard, Lorrie A. – Early Childhood Research Quarterly, 1992
Critiques Walker's article in this issue. Argues that Walker's data do not meet technical standards regarding individual placement tests for normative comparisons, interjudge reliability, or predictive validity, and therefore do not justify the use of the Gesell test to place children in developmental kindergarten or transitional first grade. (GLR)
Descriptors: Chronological Age, Early Childhood Education, Intelligence Quotient, Maturity (Individuals)

Barton, Richard M.; And Others – Educational and Psychological Measurement, 1994
The factorial validity and reliability of the newly developed Teacher Effectiveness Survey was examined in a sample of 390 principals. Results indicate that the instrument consists of three subscales (Instruction, Leadership, and Interpersonal/Professional), each measuring a different facet of a teacher's abilities. (SLD)
Descriptors: Ability, College Graduates, Factor Analysis, Factor Structure

Lamborn, Susie D. – Developmental Psychology, 1994
A 10-step scale for assessing the development of understanding of the relationships between honesty and kindness was developed and administered to 113 youths. Results indicated that development moved through 3 stages: youths aged 9-12 demonstrated abstract concepts of honesty and kindness; those aged 13-15 demonstrated simple abstract relations; and those aged 16-20…
Descriptors: Abstract Reasoning, Age Differences, Altruism, Children

Griffin, Patrick E. – Australian Journal of Education, 1990
The paper outlines the development of a scale describing the progression of literacy skills, based on similar scales used in second-language acquisition. The scale is shown to have high reliability and appropriate criterion validity. Use of the scale to analyze student reading development is illustrated. Appended are descriptions of the scale's…
Descriptors: Concurrent Validity, Elementary Secondary Education, Evaluation Methods, Literacy

Cahan, Sorel; Gejman, Alicia – Roeper Review, 1993
The constancy of intelligence quotients (IQs) of 161 gifted Israeli children, obtained initially in grades K-4 and retested 1-4 years later, was examined. Results indicated that 86% still qualified as gifted on the retest, with mean differences of five to eight IQ points. Performance scores tended to remain constant, whereas verbal scores tended…
Descriptors: Ability Identification, Elementary Education, Foreign Countries, Gifted

Smith, Richard Merrill – Academic Medicine, 1993
A University of Hawaii study compared objective and subjective assessments of the three-step triple jump examination, which tests medical students' clinical problem-solving processes. Subjects were 58 first-year students. Results showed that the subjective assessments were more consistent across problems of varying difficulty level than were objective…
Descriptors: Case Studies, Difficulty Level, Higher Education, Interrater Reliability