Publication Date
| In 2026 | 0 |
| Since 2025 | 5 |
| Since 2022 (last 5 years) | 23 |
| Since 2017 (last 10 years) | 563 |
| Since 2007 (last 20 years) | 1786 |
Descriptor
| Statistical Analysis | 2533 |
| Reliability | 1278 |
| Test Reliability | 1074 |
| Foreign Countries | 940 |
| Correlation | 633 |
| Test Validity | 630 |
| Factor Analysis | 559 |
| Validity | 508 |
| Questionnaires | 479 |
| Measures (Individuals) | 411 |
| Test Construction | 338 |
Author
| Alonzo, Julie | 12 |
| Price, Gary G. | 12 |
| Tindal, Gerald | 10 |
| Lai, Cheng-Fei | 9 |
| Brennan, Robert L. | 8 |
| Raykov, Tenko | 8 |
| Feldt, Leonard S. | 7 |
| Livingston, Samuel A. | 7 |
| Park, Bitnara Jasmine | 7 |
| Irvin, P. Shawn | 6 |
| Anderson, Daniel | 5 |
Audience
| Researchers | 34 |
| Practitioners | 21 |
| Teachers | 10 |
| Students | 8 |
| Administrators | 5 |
| Counselors | 2 |
| Parents | 1 |
| Policymakers | 1 |
Location
| Turkey | 204 |
| Nigeria | 57 |
| Jordan | 38 |
| Australia | 35 |
| Iran | 35 |
| Taiwan | 35 |
| Canada | 31 |
| China | 30 |
| Germany | 29 |
| California | 28 |
| United Kingdom | 25 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
| Does not meet standards | 1 |
Feldt, Leonard S. – 1983
This paper considers, from a theoretical point of view, two measurement approaches used in measuring success and failure in skills tests in physical education. The first, "fixed length" (FL) testing, entails counting the number of successful performances in a fixed number of trials. The second, "trials-to-criterion" (TTC)…
Descriptors: Evaluation Methods, Mathematical Formulas, Mathematical Models, Measurement Techniques
Reckase, Mark D. – 1978
Five comparisons were made relative to the quality of estimates of ability parameters and item calibrations obtained from the one-parameter and three-parameter logistic models. The results indicate: (1) The three-parameter model fit the test data better in all cases than did the one-parameter model. For simulation data sets, multi-factor data were…
Descriptors: Comparative Analysis, Goodness of Fit, Item Analysis, Mathematical Models
Raffeld, Paul; Reynolds, William M. – 1977
The pretest-posttest design referred to as Design 2 by Campbell and Stanley (1963) is commonly used in educational research and evaluation. The tenability of the assumption of a zero population difference commonly used with this design is questioned. A nonzero population estimate based on the mean difference observed in test-retest reliability…
Descriptors: Control Groups, Correlation, Experimental Groups, Hypothesis Testing
Estes, Carole; Estes, Gary D. – 1980 (PDF pending restoration)
Multiple matrix sampling is a sampling design in which both test items and examinees are randomly sampled from their respective populations. This study was designed to develop and assess a method for computing an estimate of a correlation coefficient when a multiple matrix sampling design is used. The examinee populations included 212 third-grade…
Descriptors: Correlation, Elementary Secondary Education, Evaluation Methods, Grade 3
Safford, Philip L. – 1967
The relative effectiveness of task scores versus IQ as predictors of academic achievement was investigated, and the correlations between task scores and IQ re-examined. Subjects were 99 upper-middle class elementary school children with a mean Stanford-Binet IQ of 126 (SD = 19). The instruments used were Dunn's Object Sorting Task (OST),…
Descriptors: Academic Achievement, Cognitive Tests, Creativity, Elementary Education
Brown, Bob Burton; And Others – 1967
This portion of an "Investigation of Observer-Judge Ratings of Teacher Competence" was primarily devoted to statistical issues in assessing the reliability of observations of teachers' classroom behavior. From 67 to 130 student teaching supervisors, academic professors, and education professors from two large midwestern universities and…
Descriptors: Behavior Rating Scales, Educational Philosophy, Films, Lesson Observation Criteria
Drummond, Robert J.; And Others – 1975
The Children's Interaction Matrix, Intermediate and Primary Forms, are designed to identify the preferred work and content styles of children in group situations. These factors aid the researcher, teacher, and counselor in understanding the individual's preferred mode of behavior in groups as well as indicating the students' reaction to group…
Descriptors: Elementary Education, Elementary School Students, Factor Analysis, Group Behavior
Brennan, Robert L.; Kane, Michael F. – 1975
When classes are the units of analyses, estimates of the reliability of class means are needed. Using classical test theory it is difficult to treat this problem adequately. Generalizability theory, however, provides a natural framework for dealing with the problem. Each of four possible formulas for the generalizability of class means is derived…
Descriptors: Analysis of Variance, Classes (Groups of Students), Correlation, Error Patterns
ACTION, Washington, DC. – 1975
The study presents statistics in verbal, graphic, and tabular form based on three different population sets: the population as a whole, the volunteer population during the year ending in April 1974, and the volunteer population during the week of April 7-13, 1974. The most typical volunteer was a married white woman between ages 25 and 44 who held…
Descriptors: Activities, Individual Characteristics, Motivation, National Surveys
Fuller, Edward – 1973 (PDF pending restoration)
This self-instructional manual for psychological assessment focuses on the following topics: (1) general statistics, (2) central tendency, (3) random, continuous, and discrete variables, (4) variability, (5) measuring variability, (6) sampling, (7) derived scores, (8) covariation, (9) reliability and validity, and (10) standard error of…
Descriptors: Autoinstructional Aids, Correlation, Error of Measurement, Guides
Ree, Malcolm James – 1976
A method for developing statistically parallel tests based on the analysis of unique item variance was developed. A test population of 907 basic airmen trainees was required to estimate the angle at which an object in a photograph was viewed, selecting from eight possibilities. A FORTRAN program known as VARSEL was used to rank all the test items…
Descriptors: Comparative Analysis, Computer Programs, Enlisted Personnel, Item Analysis
Gleser, Leon Jay – 1971
An attempt is made to indicate why the concept of "true score" naturally leads to the belief that test validity must increase with an increase in test and/or average item reliability, and why this is correct for the classical single-factor model first introduced by Spearman. The statistical model used by Loevinger is introduced to…
Descriptors: Factor Analysis, Item Analysis, Mathematical Models, Measurement Techniques
American Occupational Therapy Association, Rockville, MD. – 1976
This report presents the final scope, methods, and results of the evaluation of proficiency measures in occupational therapy. The intended purpose of the investigation was to evaluate and analyze the reliability and validity of measurements that are predictive of competence and proficiency at entry levels in occupational therapy. Each level of the…
Descriptors: Certification, Equivalency Tests, Evaluation, Factor Analysis
Bonzi, Susan – Journal of Documentation, 1984 (peer reviewed)
Tested the hypothesis that the vocabulary of a discipline emphasizing concrete phenomena will have fewer synonyms per concept than vocabulary of a discipline emphasizing abstract phenomena. Although concreteness and abstractness of a discipline were found to be contributing factors in terminological consistency, at least one other factor exerts…
Descriptors: Abstracts, Behavioral Sciences, Biological Sciences, Intellectual Disciplines
Frary, Robert B. – Journal of Educational Measurement, 1985 (peer reviewed)
Responses to a sample test were simulated for examinees under free-response and multiple-choice formats. Test score sets were correlated with randomly generated sets of unit-normal measures. The extent of superiority of free response tests was sufficiently small so that other considerations might justifiably dictate format choice. (Author/DWH)
Descriptors: Comparative Analysis, Computer Simulation, Essay Tests, Guessing (Tests)


