McLeod, P. J. – Evaluation and the Health Professions, 1991
Faculty opinions of an evaluation program for medical school clinical tutors were obtained through a survey of 24 undergraduate clinical tutors. Although students had been using the evaluation instrument to rate teachers for five years, faculty expressed many reservations about its reliability and validity. (SLD)
Descriptors: Clinical Teaching (Health Professions), Evaluation Methods, Higher Education, Medical Education

Bontempo, Robert – Journal of Cross-Cultural Psychology, 1993
Describes a method for assessing the quality of translations based on item response theory (IRT). Results from the IRT technique with French and Chinese versions of a scale measuring individualism-collectivism for samples of 250 U.S., 357 French, and 290 Chinese undergraduates show how several biased items are detected. (SLD)
Descriptors: Chinese, Comparative Testing, Cross Cultural Studies, Foreign Countries

Bers, Trudy H.; Smith, Kerry E. – Community College Review, 1990
Describes a study of the validity and reliability of a writing skills assessment test taken by 4,284 2-year college students in 1986-87. Assesses interrater reliability, influences of nonperformance factors (e.g., gender, native language, and form of test), predictive validity of test for future performance, and implications of findings. (DMM)
Descriptors: Basic Writing, Community Colleges, High Risk Students, Predictive Validity

Merrell, Kenneth W. – School Psychology Review, 1993
Constructed School Social Behavior Scales (SSBS) to include teacher-related and peer-related forms of social competence and antisocial behavior. Standardized SSBS using teacher ratings on 1,858 kindergarten through grade 12 students across United States. Evidence presented from several related studies in present investigation indicated that SSBS…
Descriptors: Antisocial Behavior, Behavior Rating Scales, Elementary School Students, Elementary Secondary Education

Miller, Jeff – College Teaching, 1999
A college faculty member who has graded Advanced Placement exam essays on U.S. government and politics, taken mostly by high school juniors and seniors, suggests that high school teachers and college faculty who assess the essays are not the best qualified persons to do so and that despite efforts to ensure consistency, the resulting scores are…
Descriptors: Advanced Placement, College Instruction, Essays, Evaluation Criteria

McLauchlan, William – College Teaching, 1999
A faculty consultant to the Educational Testing Service for advanced placement (AP) test reading in U.S. government and politics responds to an article criticizing essay evaluation methods and criteria, finding in it a fundamental misunderstanding of the AP reading process and explaining why the essays are subject to less scrutiny for style,…
Descriptors: Advanced Placement, College Instruction, Essays, Evaluation Criteria

Ghuman, Jaswinder Kaur; Peebles, Claire D.; Ghuman, Harinder Singh – Infants and Young Children, 1998
A review of 36 social interaction measures found that there are no measures available to evaluate infants and preschool children's basic capacity for social interaction. The available measures are described and grouped into parent-child interaction, social skills, social competence, play, adaptive behavior, communication, general development, and…
Descriptors: Adaptive Behavior (of Disabled), Behavior Problems, Emotional Disturbances, Evaluation Methods

O'Neil, Harold F.; Abedi, Jamal – Journal of Educational Research, 1996
Describes research on the development of a measure of student metacognition. The brief, domain-independent measure serves as a collateral measure in construct validation, supporting exploration of the self-regulatory demands of performance assessment. Results show that metacognition can be directly and explicitly measured in the context of…
Descriptors: Alternative Assessment, Cognitive Ability, College Students, Elementary Secondary Education

Kane, Thomas J.; Staiger, Douglas O. – Brookings Papers on Education Policy, 2002
By the spring of 2000, forty states had begun using student test scores to rate school performance. Twenty states have gone a step further and are attaching explicit monetary rewards or sanctions to a school's test performance. In this paper, the authors focus on accountability programs in which states measure the effectiveness of individual…
Descriptors: Elementary Schools, Accountability, Scores, Risk

Schuwirth, L.; Gorter, S.; Van der Heijde, D.; Rethans, J. J.; Brauer, J.; Houben, H.; Van der Linden, S.; Van der Vleuten, C.; Scherpbier, A. – Advances in Health Sciences Education, 2005
Introduction: For postgraduate training of doctors there is a need for valid and reliable instruments to assess their daily performance. Various instruments have been suggested, some of which use incognito simulated patients (SPs). These methods are resource intensive. Computerised Case-based testing (CCT) is logistically simpler and may still…
Descriptors: Check Lists, Performance Based Assessment, Testing, Predictive Validity

Nicholls, Tonia L.; Brink, Johann; Desmarais, Sarah L.; Webster, Christopher D.; Martin, Mary-Lou – Assessment, 2006
A new assessment scheme, the Short-Term Assessment of Risk and Treatability (START), presents a workable method for assessing risks to self and others encountered in mentally and personality disordered clients. This study aimed to demonstrate (a) prevalence and severity of risk behaviors measured by the START, (b) psychometric properties of…
Descriptors: Risk, Mental Disorders, Aggression, Incidence

Hjemdal, Odin; Friborg, Oddgeir; Stiles, Tore C.; Martinussen, Monica; Rosenvinge, Jan H. – Measurement and Evaluation in Counseling and Development, 2006
In this study, the Resilience Scale for Adolescents (READ) was developed with confirmatory factor analysis and a cross-validated factor model. The results show that the READ has sound psychometric qualities and that it measures all the central aspects of the psychological construct of resiliency. (Contains 4 tables.)
Descriptors: Measures (Individuals), Psychometrics, Factor Analysis, Factor Structure

Shou, Priscilla – 1993
The Singer-Loomis Inventory of Personality (SLIP) was developed by two Jungian analysts to allow examination of personality from the perspective of Jung's typology and to solve problems perceived with the Myers-Briggs Type Indicator, based on Jungian dichotomies. The SLIP is designed to clarify and describe the user's personality based on the…
Descriptors: Adults, Classification, Cognitive Style, Extraversion Introversion

Reckase, Mark D. – 1996
The American College Testing Program (ACT) is field testing a portfolio assessment model. The field test is designed to determine whether it is possible to implement a portfolio assessment model on a national level that will result in scores that are of sufficient reliability and validity that they can be used for decisions at the student level.…
Descriptors: College Entrance Examinations, Cooperation, Field Tests, High Schools

Perez, Kristina M. – 1996
The KeyMath Revised is a power test that measures the understanding and application of mathematics skills and concepts. It is individually administered and is intended for students from kindergarten through the ninth grade to determine student mastery of mathematics concepts. The revised version is designed to be user-friendly for the student and…
Descriptors: Comprehension, Curriculum Development, Diagnostic Tests, Educational Diagnosis