Peer reviewed: Pandey, Tej N.; Hubert, Lawrence – Psychometrika, 1975
The use of Tukey's Jackknife in establishing a confidence interval around the population coefficient alpha is explored, and the robustness of Feldt's procedure and of ten variants of the Jackknife is evaluated when the data do not conform to the necessary normality requirements. Only two of the variants were comparable to Feldt's approach. (RC)
Descriptors: Comparative Analysis, Measurement Techniques, Sampling, Statistical Bias
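As a rough illustration of the idea in this abstract, the sketch below computes coefficient alpha and a delete-one jackknife interval around it. It is a minimal sketch under stated assumptions, not the procedure evaluated in the paper: it jackknifes the untransformed alpha, uses a normal-approximation interval, and runs on invented data. The function names `cronbach_alpha` and `jackknife_alpha_ci` are hypothetical.

```python
from statistics import NormalDist, mean, variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of examinee rows (one score per item)."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def jackknife_alpha_ci(scores, level=0.95):
    """Delete-one jackknife estimate of alpha with a normal-approximation interval.

    Assumption: plain jackknife on untransformed alpha; the variants studied
    in the paper are not reproduced here.
    """
    n = len(scores)
    full = cronbach_alpha(scores)
    # Pseudovalue i = n * full-sample estimate - (n - 1) * estimate without row i.
    pseudo = [
        n * full - (n - 1) * cronbach_alpha(scores[:i] + scores[i + 1:])
        for i in range(n)
    ]
    estimate = mean(pseudo)
    std_err = (variance(pseudo) / n) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, (estimate - z * std_err, estimate + z * std_err)


# Hypothetical data: 6 examinees by 4 items, invented for illustration only.
data = [
    [3, 4, 3, 5],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(data)
estimate, (low, high) = jackknife_alpha_ci(data)
print(f"alpha={alpha:.3f}, jackknife={estimate:.3f}, interval=({low:.3f}, {high:.3f})")
```

With so few examinees the interval is wide and can extend past 1. Small-sample behavior of this kind is part of why interval procedures for alpha are compared in the first place.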
Pandey, Tej N.; Hubert, Lawrence J. – 1974
This investigation had two major purposes. The first was to explore the use of an inferential technique called Tukey's Jackknife in establishing a confidence interval about coefficient alpha reliability. The second purpose was to study the robustness of the Feldt and the jackknife procedures when the data fail to satisfy usual normality…
Descriptors: Comparative Analysis, Item Sampling, Statistical Analysis, Statistics
Peer reviewed: Vollmerhausen, Susan; And Others – Journal of Clinical Psychology, 1986
Compared Kennedy and Elder's (1982) Wechsler Intelligence Scale for Children (Revised) regression model with Kaufman's (1976) linear equating model. The Kennedy and Elder and the Kaufman abbreviated forms attained a high degree of association, suggesting that both models are equally effective. (Author/BL)
Descriptors: Comparative Analysis, Institutionalized Persons, Models, Special Education
Coscarelli, William; Shrock, Sharon – Performance Improvement Quarterly, 2002
Discusses problems in using traditional measures of reliability for criterion-referenced tests (CRTs) and describes two approaches to reliability for CRTs: estimates sensitive to all measures of error; and estimates of consistency in test outcome. Compares the two approaches and proposes recommendations for interpretation and use. (Author/LRW)
Descriptors: Comparative Analysis, Criterion Referenced Tests, Measurement Techniques, Test Reliability
Thrash, Susan K.; Porter, Andrew C. – 1974
The purpose of this paper is to prove that one currently recommended method of obtaining the reliability of an instrument defined on a population of aggregate units is invalid. This method randomly splits the aggregate into two halves, correlates the two half unit scores by a Pearson product moment correlation coefficient, and corrects the…
Descriptors: Comparative Analysis, Correlation, Measurement Techniques, Sampling
Thostenson, Marvin S. – 1966
This investigation dealt with the development and evaluation of both a music dictation test (PRM78 Dictation Test) and a sightsinging test (CSS76 Criterion Sightsinging Test). It was hoped that the dictation test could eventually be developed to serve as an adequate replacement for the latter. Thirteen samples participated in this project--7…
Descriptors: Auditory Training, Comparative Analysis, Music Reading, Statistical Analysis
Kleinke, David J. – 1976
Data from 200 college-level tests were used to compare three reliability approximations (two of Saupe and one of Cureton) to Kuder-Richardson Formula 20 (KR20). While the approximations correlated highly (about .9) with the reliability estimate, they tended to be underapproximations. The explanation lies in an apparent bias of Lord's approximation…
Descriptors: Comparative Analysis, Correlation, Error of Measurement, Statistical Analysis
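For readers unfamiliar with the benchmark used in this study, the sketch below computes Kuder-Richardson Formula 20 for dichotomously scored items. It also computes KR-21, a different and widely known approximation that needs only the test mean and variance. KR-21 is swapped in purely for illustration; the Saupe and Cureton approximations examined in the paper are not reproduced. The data and the function names `kr20` and `kr21` are hypothetical.

```python
from statistics import mean, pvariance


def kr20(scores):
    """KR-20 for rows of 0/1 item scores, using population variances."""
    k = len(scores[0])
    n = len(scores)
    # Proportion of examinees answering each item correctly.
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / total_var)


def kr21(scores):
    """KR-21: an approximation that treats all items as equally difficult."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    m = mean(totals)
    return k / (k - 1) * (1 - m * (k - m) / (k * pvariance(totals)))


# Hypothetical data: 5 examinees by 4 items, invented for illustration only.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(data), 3), round(kr21(data), 3))
```

On these data KR-21 falls below KR-20. That is expected whenever item difficulties differ, because KR-21 ignores the variation in difficulty.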
Peer reviewed: Martois, John S. – Educational and Psychological Measurement, 1973
Copies of this program may be obtained from the author at the University of Southern California, School of Pharmacy, University Park, Los Angeles 90007. (CB)
Descriptors: Comparative Analysis, Computer Programs, Input Output, Statistical Analysis
Peer reviewed: Lord, Frederic M. – Journal of Educational Measurement, 1974
When comparing two tests that measure the same trait, separate comparisons should be made at different levels of the trait. A simple, practical, approximate formula is given for doing this. The adequacy of the approximation is illustrated using data comparing seven nationally known sixth-grade reading tests. (Author/RC)
Descriptors: Ability Identification, Comparative Analysis, Reading Tests, Statistical Analysis
Manpower Administration (DOL), Washington, DC. U.S. Training and Employment Service. – 1969
To compare the reliability of performance on recorded dictation tests with performance on live tests, 216 university students who were nearing completion of an intermediate shorthand course and 26 job applicants seeking stenographic positions were divided into 10 groups, with five receiving live dictation and five receiving recorded dictation. The…
Descriptors: Comparative Analysis, Comparative Testing, Evaluation, Performance Tests
Peer reviewed: Jackson, Paul H.; Agunwamba, Christian C. – Psychometrika, 1977
Finding and interpreting lower bounds for reliability coefficients for tests with nonhomogeneous items has been a problem for psychometricians. This paper presents a mathematical formula for finding the greatest lower bound for such a coefficient. (Author/JKS)
Descriptors: Comparative Analysis, Mathematical Models, Measurement, Test Interpretation
Peer reviewed: Quereshi, M. Y.; Ostrowski, Michael J. – Journal of Clinical Psychology, 1985
Administered three Wechsler adult intelligence scales to 72 undergraduates and tested the equality of means, variances, and covariances, utilizing subtest scaled scores and IQs. Results indicated that the three scales were not parallel. Generally, the subtest scaled scores exhibited less similarity across the three scales than the IQ estimates.…
Descriptors: College Students, Comparative Analysis, Higher Education, Intelligence Tests
Michaelides, Michalis P.; Haertel, Edward H. – Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2004
There is variability in the estimation of an equating transformation because common-item parameters are obtained from responses of samples of examinees. The most commonly used standard error of equating quantifies this source of sampling error, which decreases as the sample size of examinees used to derive the transformation increases. In a…
Descriptors: Test Items, Testing, Error Patterns, Interrater Reliability
Peer reviewed: Crehan, Kevin D.; Slakter, Malcolm J. – Psychological Reports, 1971
Descriptors: Comparative Analysis, Multiple Choice Tests, Test Construction, Test Reliability
Hambleton, Ronald K.; And Others – Journal of Educational Measurement, 1970
Descriptors: Comparative Analysis, Evaluation Methods, Multiple Choice Tests, Test Reliability


