Peer reviewed
Yule, William; Rigley, Leslie V. – Journal of Research in Reading, 1982
Findings suggest that modestly good predictions of scores on group reading tests administered at ages seven and eight can be made from IQ as measured by the Wechsler Intelligence Scale for Children at age five and one-half. (FL)
Descriptors: Intelligence Tests, Predictive Validity, Primary Education, Reading Tests
Peer reviewed
Gallagher, Dolores; And Others – Journal of Consulting and Clinical Psychology, 1982
Reports three reliability coefficients for the Beck Depression Inventory using samples of elderly community volunteers and depressed outpatients. All three indexes were reasonably high in the total sample and fell within the accepted range of reliability for a clinical screening instrument. (Author)
Descriptors: Depression (Psychology), Diagnostic Tests, Measures (Individuals), Older Adults
Peer reviewed
van den Wollenberg, Arnold L. – Psychometrika, 1982
Presently available test statistics for the Rasch model are shown to be insensitive to violations of the assumption of test unidimensionality. Two new statistics are presented. One is similar to available statistics, but with some improvements; the other addresses the problem of insensitivity to unidimensionality. (Author/JKS)
Descriptors: Item Analysis, Latent Trait Theory, Statistics, Test Reliability
Peer reviewed
Brown, Hilary S. R.; May, Arthur E. – Journal of Consulting and Clinical Psychology, 1979
The test-retest IQs of 50 patients were correlated. The patients were included in the sample only because they had been given the Wechsler Adult Intelligence Scale before. The interval between test and retest averaged almost two years. All test-retest correlations were .90 or better. (Author)
Descriptors: Correlation, Followup Studies, Foreign Countries, Intelligence Tests
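The entry's headline figure is a plain test-retest Pearson correlation between two administrations of the same test. A minimal sketch of that computation, with made-up score vectors standing in for the study's WAIS data:

```python
# Pearson test-retest correlation between two administrations of one test.
# The score lists are illustrative only, not the study's data.
from statistics import mean, stdev

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between paired scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

test1 = [95, 102, 110, 88, 120, 99]   # first administration (hypothetical)
test2 = [97, 100, 113, 90, 118, 101]  # retest about two years later (hypothetical)
print(round(pearson_r(test1, test2), 3))
```

With scores that track each other this closely, the coefficient lands above .90, the range the abstract reports.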
Peer reviewed
Burton, Nancy W. – Educational and Psychological Measurement, 1981
This study was concerned with selecting a measure of scorer agreement for use with the National Assessment of Educational Progress. The simple percent of agreement and Cohen's kappa were compared. It was concluded that Cohen's kappa does not add sufficient information to make its calculation worthwhile. (Author/BW)
Descriptors: Educational Assessment, Elementary Secondary Education, Quality Control, Scoring
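The two agreement measures Burton compares differ only in a chance correction: kappa discounts the agreement two scorers would reach by guessing from their marginal rates. A minimal sketch with illustrative scorer ratings (not NAEP data):

```python
# Simple percent agreement vs. Cohen's kappa for two scorers.
# Ratings below are illustrative: "r" = right, "w" = wrong.
from collections import Counter

def percent_agreement(a, b):
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Kappa corrects observed agreement for agreement expected by chance."""
    n = len(a)
    p_obs = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    p_exp = sum(ca[c] * cb[c] for c in set(a) | set(b)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

scorer1 = ["r", "r", "w", "r", "w", "r", "r", "w"]
scorer2 = ["r", "r", "w", "w", "w", "r", "r", "r"]
print(percent_agreement(scorer1, scorer2))          # 0.75
print(round(cohens_kappa(scorer1, scorer2), 3))     # 0.467
```

Kappa is always lower than raw agreement whenever chance agreement is nonzero; Burton's conclusion is that, for the NAEP scoring context, this correction adds too little information to justify computing.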
Peer reviewed
Raju, Nambury S. – Psychometrika, 1979
An important relationship is given for two generalizations of coefficient alpha: (1) Rajaratnam, Cronbach, and Gleser's generalizability formula for stratified-parallel tests, and (2) Raju's coefficient beta. (Author/CTM)
Descriptors: Item Analysis, Mathematical Formulas, Test Construction, Test Items
Peer reviewed
Brennan, Robert L.; Prediger, Dale J. – Educational and Psychological Measurement, 1981
This paper considers some appropriate and inappropriate uses of coefficient kappa and alternative kappa-like statistics. Discussion is restricted to the descriptive characteristics of these statistics for measuring agreement with categorical data in studies of reliability and validity. (Author)
Descriptors: Classification, Error of Measurement, Mathematical Models, Test Reliability
Peer reviewed
Klein, Alice E. – Educational and Psychological Measurement, 1980
The test-retest reliability and predictive validity of the Northwestern Syntax Screening Test (NSST) with pre-kindergarten pupils were investigated. The test was found to have moderate test-retest reliability and to be moderately accurate in predicting the general academic achievement test scores of pupils in kindergarten and first grade. (Author/GK)
Descriptors: Academic Achievement, Predictive Validity, Preschool Children, Screening Tests
Peer reviewed
Cummins, R. Porter – Journal of Reading, 1981
Reviews the Nelson-Denny Reading Test (Forms E and F) and finds it an easy-to-use, valid norm-referenced survey test for determining the level of student reading achievement, assessing individual differences, and deriving group means. (AEA)
Descriptors: Evaluation Methods, Reading Achievement, Reading Tests, Test Reliability
Peer reviewed
Rogers, Dan L. – Perceptual and Motor Skills, 1980
To assess the utility and reliability of Bender test recall in children, 304 children (ages 5 through 14) were individually administered the copy and recall phases using Koppitz's directions. The recall phase was judged to be of doubtful utility in assessing intellectual functioning in children. (Author/SJL)
Descriptors: Age Differences, Children, Intelligence Tests, Recall (Psychology)
Peer reviewed
Feldt, Leonard S. – Psychometrika, 1980
Procedures are developed for testing the hypothesis that Cronbach's alpha reliability coefficient is equal for two tests given to the same subjects. (Author/JKS)
Descriptors: Error of Measurement, Hypothesis Testing, Measurement, Statistical Significance
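The quantity Feldt's procedures compare is Cronbach's alpha, computed from the item and total-score variances of a single administration. A minimal sketch of alpha itself (not of Feldt's significance test), using an illustrative persons-by-items score matrix:

```python
# Cronbach's alpha from a persons-by-items 0/1 score matrix.
# The data matrix is illustrative, not from the article.
def cronbach_alpha(scores):
    """scores: list of per-person lists of item scores."""
    k = len(scores[0])                      # number of items
    def var(xs):                            # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = [var([p[i] for p in scores]) for i in range(k)]
    total_var = var([sum(p) for p in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

data = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0],
        [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 0]]
print(round(cronbach_alpha(data), 3))       # 0.779
```

Feldt's contribution is a significance test for whether two such coefficients, computed for two tests taken by the same subjects, differ beyond sampling error.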
Peer reviewed
Wilcox, Rand R. – Educational and Psychological Measurement, 1979
The classical approach to estimating a binomial probability function is to estimate its mean in the usual manner and substitute the result in the appropriate expression. Two alternative estimation procedures are described and examined. Emphasis is given to the single-administration estimate of mastery test reliability. (Author/CTM)
Descriptors: Cutting Scores, Mastery Tests, Probability, Scores
Peer reviewed
Cudeck, Robert – Journal of Educational Measurement, 1980
Methods for evaluating the consistency of responses to test items were compared. When a researcher is unwilling to make the assumptions of classical test theory, has only a small number of items, or is in a tailored testing context, Cliff's dominance indices may be useful. (Author/CTM)
Descriptors: Error Patterns, Item Analysis, Test Items, Test Reliability
Peer reviewed
Brennan, Robert L.; Lockwood, Robert E. – Applied Psychological Measurement, 1980
Generalizability theory is used to characterize and quantify expected variance in cutting scores and to compare the Nedelsky and Angoff procedures for establishing a cutting score. Results suggest that the restricted nature of the Nedelsky (inferred) probability scale may limit its applicability in certain contexts. (Author/BW)
Descriptors: Cutting Scores, Generalization, Statistical Analysis, Test Reliability
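The two standard-setting procedures Brennan and Lockwood compare both sum judges' per-item probability estimates for a minimally competent examinee, but Nedelsky's implied probabilities are restricted to the values 1/(options remaining). A minimal sketch with hypothetical judge ratings for a five-item, four-option test:

```python
# Angoff vs. Nedelsky cutting scores; judge ratings are illustrative.
# Angoff: each judge directly estimates P(correct) per item.
angoff = [
    [0.6, 0.7, 0.5, 0.8, 0.4],   # judge 1
    [0.5, 0.8, 0.6, 0.7, 0.5],   # judge 2
]
# Nedelsky: each judge records how many options a borderline examinee
# cannot eliminate; implied P(correct) = 1 / (options remaining), so on a
# four-option item values are restricted to 1/4, 1/3, 1/2, or 1.
nedelsky_remaining = [
    [2, 2, 3, 1, 4],             # judge 1
    [2, 1, 2, 2, 3],             # judge 2
]

def angoff_cut(ratings):
    n_items = len(ratings[0])
    return sum(sum(j[i] for j in ratings) / len(ratings)
               for i in range(n_items))

def nedelsky_cut(remaining):
    n_items = len(remaining[0])
    return sum(sum(1 / j[i] for j in remaining) / len(remaining)
               for i in range(n_items))

print(round(angoff_cut(angoff), 2))                 # 3.05
print(round(nedelsky_cut(nedelsky_remaining), 2))   # 2.71
```

The coarse, restricted Nedelsky scale is exactly the feature the abstract suggests may limit that procedure's applicability.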
Peer reviewed
Fox, Robert A. – Journal of School Health, 1980
Some practical guidelines for developing multiple choice tests are offered. Included are three steps: (1) test design; (2) proper construction of test items; and (3) item analysis and evaluation. (JMF)
Descriptors: Guidelines, Objective Tests, Planning, Test Construction
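Step 3 of the guidelines above, item analysis, classically means computing each item's difficulty (proportion correct) and a discrimination index contrasting high- and low-scoring examinees. A minimal sketch with an illustrative 0/1 response matrix (not from the article):

```python
# Classic item analysis: difficulty (proportion correct) and an
# upper-lower discrimination index. Responses are illustrative.
def item_stats(responses):
    """responses: list of per-examinee 0/1 item-score lists; returns
    (difficulty, discrimination) per item using top/bottom halves."""
    ranked = sorted(responses, key=sum, reverse=True)  # best examinees first
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    n_items = len(responses[0])
    stats = []
    for i in range(n_items):
        p = sum(r[i] for r in responses) / len(responses)        # difficulty
        d = (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / half
        stats.append((round(p, 2), round(d, 2)))
    return stats

answers = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0],
           [1, 1, 1], [1, 0, 1]]
print(item_stats(answers))   # [(0.83, 0.33), (0.5, 1.0), (0.5, 0.33)]
```

Items with high difficulty values are easy; items with low discrimination fail to separate strong from weak examinees and are candidates for revision.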


