Kristof, Walter – 1973
This study in parametric test theory deals with the statistics of reliability estimation when scores on two parts of a test follow a binormal distribution with equal (case 1) or unequal (case 2) expectations. In each case biased maximum-likelihood estimators of reliability are obtained and converted into unbiased estimators. Sampling distributions…
Descriptors: Expectation, Research Reports, Sample Size, Sampling
Werts, Charles E.; Linn, Robert L. – 1972
Given multiple independent measures of an underlying true factor and information on group membership, it is possible to compute a set of observed group means for each measure. Given at least three tests, these sets of means may be used to compute the reliability of the means for each test. The procedure for estimating true scores from the…
Descriptors: Factor Analysis, Mathematical Models, Research, Research Reports
Reilly, Richard R.; Jackson, Rex – 1972
Item options of shortened forms of the Graduate Record Examination Verbal and Quantitative tests were empirically weighted by two variants of a method originally attributed to Guttman. The first method assigned to each option of an item the mean standard score on the remaining items of all subjects choosing that option. The second procedure…
Descriptors: Correlation, Factor Analysis, Graduate Study, Scoring
Mandeville, Garrett K.
Results of a comparative study of F and Q tests, in a randomized block design with one replication per cell, are presented. In addition to these two procedures, a multivariate test was also considered. The model and test statistics, data generation and parameter selection, results, summary and conclusions are presented. Ten tables contain the…
Descriptors: Comparative Analysis, Data Analysis, Mathematical Models, Models
Garvin, Alfred D.
Confidence weighting (CW) tends to improve the reliability of easy tests; the Coombs-type multiple-response (MR) option tends to improve the reliability of hard tests. It was hypothesized that, on a test of moderate difficulty, offering both the CW and MR response options would improve reliability more than either alone. Twenty-four subjects took…
Descriptors: Confidence Testing, Educational Testing, Multiple Choice Tests, Response Style (Tests)
Randall, Robert S. – 1972
Differences in design between norm referenced measures (NRM) and criterion referenced measures (CRM) are reviewed, and some of the procedures proposed on designing and evaluating CRM are examined. Differences in design of NRM and CRM are said to arise from the different purposes that underlie each measure. In addition, there are differences among…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Norm Referenced Tests, Test Construction
Ashlock, Patrick; Thompson, Glen – 1976
Information is presented on the validity, reliability, and normative data of the Ashlock Tests of Visual Perception-Revised, a screening test for perceptual difficulties in preschoolers. Discussed are implications for using the test for screening, diagnosis, and educational planning. (CL)
Descriptors: Learning Disabilities, Perceptual Handicaps, Preschool Education, Screening Tests
Christal, Raymond E.; Weissmuller, Johnny J. – 1975
Several new programs have been added to those of the Comprehensive Occupational Data Analysis Programs (CODAP), all oriented toward analyzing and manipulating information describing work tasks, rather than jobs or persons. REXALL analyzes the inter-rater agreement among judges concerning task-factor ratings. TSKFAC adds factor weight vectors to…
Descriptors: Computer Programs, Job Analysis, Occupational Information, Performance Factors
Willoughby, Lee; And Others – 1976
This study compared a domain referenced approach with a traditional psychometric approach in the construction of a test. Results of the December, 1975 Quarterly Profile Exam (QPE) administered to 400 examinees at a university were the source of data. The 400 item QPE is a five alternative multiple choice test of information a "safe"…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Norm Referenced Tests, Statistical Analysis
Nevo, Barukh – Measurement and Evaluation in Guidance, 1976
Freshmen (N=202) took two batteries of aptitude tests 10 months apart. Six pairs of tests were studied. Two pairs were identical, two were parallel, and two were completely different. This design made it possible to separate three components of practice: (a) general test sophistication, (b) specific practice effect, and (c) item familiarization.…
Descriptors: Aptitude Tests, College Freshmen, Comparative Analysis, Group Testing
Peer reviewed: Conger, Anthony J.; Conger, Judith Cohen – Educational and Psychological Measurement, 1975
Measures of multivariate reliability were calculated for profiles of Wechsler Intelligence Scale for Children (WISC) subscales on three age groups. Profile dimensions based on reliability considerations were established and matched across age groups and with factor analytic dimensions. WISC was found to measure general ability and verbal and…
Descriptors: Elementary Secondary Education, Individual Differences, Intelligence Tests, Profiles
Peer reviewed: Perney, Jan – Educational and Psychological Measurement, 1975
The concurrent validity of the Student Opinion Inventory (SOI) factor scales was determined and the reproducibility of the reliability estimates found in pilot studies of the instrument examined. Responses to the SOI from students indicated that 5 of the 6 factor scales of the SOI possessed some concurrent validity. (Author/BJG)
Descriptors: Attitude Measures, Participant Satisfaction, Secondary Education, Student Attitudes
Peer reviewed: Porter, Don – Language Learning, 1978
Reports on an experiment designed to test the reliability of the cloze procedure in second language testing, specifically as a measure of overall language proficiency, and as a measure whose results are independent of style. (AM)
Descriptors: Cloze Procedure, Language Proficiency, Language Styles, Language Tests
Peer reviewed: Rounds, James B., Jr.; And Others – Applied Psychological Measurement, 1978
Two studies compared multiple rank order and paired comparison methods in terms of psychometric characteristics and user reactions. Individual and group item responses, preference counts, and Thurstone normal transform scale values obtained by the multiple rank order method were found to be similar to those obtained by paired comparisons.…
Descriptors: Higher Education, Measurement, Rating Scales, Response Style (Tests)
Peer reviewed: Serlin, Ronald C.; Kaiser, Henry F. – Educational and Psychological Measurement, 1978
When multiple-choice tests are scored in the usual manner, giving each correct answer one point, information concerning response patterns is lost. A method for utilizing this information is suggested. An example is presented and compared with two conventional methods of scoring. (Author/JKS)
Descriptors: Correlation, Factor Analysis, Item Analysis, Multiple Choice Tests


