Severy, Lawrence J. – 1974
Issues relevant to the nature of attitudes are discussed. The reader is referred to works indexing a variety of existent attitude scales. The way in which one constructs, administers, scores, interprets, and presents findings of an original attitude measuring device is discussed comprehensively, and yet in a nontechnical fashion for…
Descriptors: Attitude Measures, Attitudes, Scoring, Test Construction
Gibson, Dennis L. – 1970
The Hill Interaction Matrix-G (HIM-G), a 72-item questionnaire, is a shorter method for the analysis of verbal interaction in small groups. Intended as a substitute for the more precise Hill Interaction Matrix Rating System (HIM-SS), this version can be rated within twenty minutes after observation of the group. A Fortran computer scoring program…
Descriptors: Group Dynamics, Group Therapy, Interaction Process Analysis, Measurement Instruments
Stedman, Donald J.; And Others – 1967
This paper reports on the administration of the Preschool Attainment Record (PAR), which is used to estimate developmental levels in children from 6 months to 8 years of age. The PAR was given to 17 5-year-old disadvantaged boys and girls of average intelligence. To reduce the tendency of evaluators to inflate scores, the test was administered by…
Descriptors: Child Development, Preschool Children, Sex Differences, Test Reliability
Haladyna, Thomas M. – 1974
Classical test theory has been rejected for application to criterion-referenced (CR) tests by most psychometricians due to an expected lack of variance in scores and other difficulties. The present study was conceived to resolve the variance problem and explore the possibility that classical test theory is both appropriate and desirable for some…
Descriptors: Criterion Referenced Tests, Error of Measurement, Sampling, Test Construction
Manpower Administration (DOL), Washington, DC. U.S. Training and Employment Service. – 1969
To compare the reliability of performance on recorded dictation tests with performance on live tests, 216 university students who were nearing completion of an intermediate shorthand course and 26 job applicants seeking stenographic positions were divided into 10 groups, with five receiving live dictation and five receiving recorded dictation. The…
Descriptors: Comparative Analysis, Comparative Testing, Evaluation, Performance Tests
Test Service Bulletin, 1952
Some aspects of test reliability are discussed. Topics covered are: (1) how high should a reliability coefficient be?; (2) two factors affecting the interpretation of reliability coefficients--range of talent and interval between testings; (3) some common misconceptions--reliability of speed tests, part vs. total reliability, reliability for what…
Descriptors: Bulletins, Correlation, Scores, Statistical Analysis
Kennedy, Beth T. – 1972
Issues related to the evaluation of instructional programs developed under the auspices of the Southwest Educational Development Laboratory are briefly discussed. The Laboratory develops criterion-referenced tests which form an integral part of each instructional program. The importance of examining the reliability and validity of these tests is…
Descriptors: Criterion Referenced Tests, Evaluation Methods, Instructional Programs, Test Reliability
Ellis, E. N. – 1975
Concern over the reading and writing programs in Vancouver, British Columbia Schools culminated in the establishment in June 1974 of a Task Force on English. In response to the request from the Task Force for a survey of the writing ability of Grade 11 students, a committee of English Department Heads assisted in developing an instrument and the…
Descriptors: Essay Tests, Grade 11, Scoring, Secondary Education
Neill, John A.; Jackson, Douglas N. – Educational and Psychological Measurement, 1976 (peer reviewed)
Illustrates a multivariate approach to item analysis. Previous formulation is extended by investigating techniques simultaneously taking into account scale variance with the goal of reducing the average correlation between scales. Study examines problems in determining optimum values for combinations of item parameters selected for personality…
Descriptors: Correlation, Factor Structure, Item Analysis, Personality Measures
Carroll, C. Dennis – Educational and Psychological Measurement, 1976 (peer reviewed)
A computer program for item evaluation, reliability estimation, and test scoring is described. The program contains a variable format procedure allowing flexible input of responses. Achievement tests and affective scales may be analyzed. (Author)
Descriptors: Achievement Tests, Affective Measures, Computer Programs, Item Analysis
Brennan, Robert L. – Educational and Psychological Measurement, 1975 (peer reviewed)
Variance components from a split-plot factorial (SPF) design were used to estimate reliability for schools and for persons within schools. Reliabilities for persons within schools were compared between the SPF and randomized block (RB) designs, as were reliabilities for schools. (Author/BJG)
Descriptors: Analysis of Variance, Evaluation Methods, Schools, Statistical Analysis
Mehrabian, Albert; Hines, Melissa – Educational and Psychological Measurement, 1978 (peer reviewed)
Reliability and validity data are reported for a questionnaire measure of individual differences in dominance-submissiveness. The 48-item questionnaire, which was balanced for response bias, had high internal consistency and correlated highly with other available measures of dominance. (Author/JKS)
Descriptors: Higher Education, Individual Characteristics, Personality Measures, Questionnaires
Huynh, Huynh – Psychometrika, 1978 (peer reviewed)
The use of Cohen's kappa index as a measure of the reliability of multiple classifications is developed. Special cases of the index as well as the effects of test length on the index are also explored. (JKS)
Descriptors: Career Development, Classification, Mastery Tests, Test Length
Krashen, Stephen D. – Language Learning, 1978 (peer reviewed)
Cites evidence showing that the "natural order" found using the Bilingual Syntax Measure to measure morpheme order is not an artifact of the test. (Author/AM)
Descriptors: Language Acquisition, Language Tests, Morphemes, Second Language Learning
Huck, Schuyler W. – Educational and Psychological Measurement, 1978 (peer reviewed)
A modification of Hoyt's analysis of variance model for test analysis was proposed by Lu. A difficulty that may be encountered in using Lu's modification is examined, and a solution is proposed. (JKS)
Descriptors: Analysis of Variance, Difficulty Level, Item Analysis, Test Items


