Showing 1 to 15 of 38 results
Peer reviewed
Spaulding, Cheryl L. – Journal of Reading, 1989
Reviews "Written Language Assessment" (WLA), a new standardized test that evaluates children's and adolescents' written language competence by having students write essays instead of answering multiple-choice questions. Finds problems with the WLA in terms of interrater reliability. (RS)
Descriptors: Elementary Secondary Education, Essay Tests, Interrater Reliability, Standardized Tests
Peer reviewed
Sawilowsky, Shlomo S. – Educational and Psychological Measurement, 2000
B. Thompson and T. Vacha-Haase have examined the statement "the reliability of the test" with emphasis on the following three words: (1) the first "the"; (2) "test"; and (3) the second "the." This discussion focuses instead on the word "reliability." (Author)
Descriptors: Generalization, Meta Analysis, Psychometrics, Reliability
Peer reviewed
Collins, Linda M. – Applied Psychological Measurement, 1996
The clarification provided by Williams and Zimmerman on the reliability of gain scores is translated into recognizable patterns of change that tend to produce reliable or unreliable gain scores. The relevance of the traditional idea of reliability to the measurement of change is also discussed. (SLD)
Descriptors: Achievement Gains, Change, Measurement Techniques, Reliability
Peer reviewed
Camilli, Gregory – Journal of Educational Measurement, 1999
Yen and Burket suggested that shrinkage in vertical equating cannot be understood apart from multidimensionality. Reviews research on reliability, multidimensionality, and scale shrinkage, and explores issues of practical importance to educators. (SLD)
Descriptors: Equated Scores, Error of Measurement, Item Response Theory, Reliability
Peer reviewed
Arno, Kevin S. – Journal of Reading, 1990
Notes that the third edition of the Burns/Roe Informal Reading Inventory takes less time to administer. Reports an absence of data on the inventory's reliability. Concludes that if used to study, evaluate, or diagnose reading behaviors, the Burns and Roe IRI could be a popular and valuable tool. (RS)
Descriptors: Elementary Secondary Education, Informal Reading Inventories, Reading Diagnosis, Test Reliability
Peer reviewed
Martin-Rehrmann, James – Journal of Reading, 1990
Reviews the fourth edition of the Analytic Reading Inventory (ARI). Notes the addition of suggestions for diagnostic interpretation and teacher interpretations. Finds the ARI to be a convenient yet reliable diagnostic tool. (RS)
Descriptors: Elementary Secondary Education, Informal Reading Inventories, Reading Diagnosis, Test Reliability
Peer reviewed
Thompson, Bruce – Journal of Experimental Education, 2001
Asserts that editors should declare their expectations publicly and expose the rationale for editorial policies to public scrutiny. Supports effect size reporting and the reporting of score reliabilities. Argues against stepwise methods. Also discusses the interpretation of structure coefficients and the use of confidence intervals. (SLD)
Descriptors: Editing, Effect Size, Reliability, Research Methodology
Peer reviewed
Knapp, Thomas R.; Sawilowsky, Shlomo S. – Journal of Experimental Education, 2001
Replies to Bruce Thompson's positions on research methodology and editorial policy, addressing each of these issues in the ongoing discussion: (1) structure coefficients; (2) stepwise regression; (3) test reliability; (4) effect sizes; and (5) meta-analysis. (SLD)
Descriptors: Editing, Effect Size, Reliability, Research Methodology
Peer reviewed
Berry, Kenneth J.; Mielke, Paul W., Jr. – Educational and Psychological Measurement, 1997
Describes a FORTRAN software program that calculates the probability of an observed difference between agreement measures obtained from two independent sets of raters. An example illustrates the use of the DIFFER program in evaluating undergraduate essays. (Author/SLD)
Descriptors: Comparative Analysis, Computer Software, Evaluation Methods, Higher Education
Peer reviewed
Humphreys, Lloyd G. – Applied Psychological Measurement, 1996
The reliability of a gain is determined by the reliabilities of the components, the correlation between them, and their standard deviations. Reliability is not inherently low, but the components of gains in many investigations make low reliability likely and require caution in the use of gain scores. (SLD)
Descriptors: Achievement Gains, Change, Correlation, Error of Measurement
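Note: the dependence described in the abstract above can be illustrated with the textbook classical-test-theory formula for the reliability of a difference score. This is a minimal sketch, not necessarily the exact formulation used in the article; it assumes uncorrelated measurement errors, and the function and parameter names are hypothetical.

```python
def gain_score_reliability(rel_x, rel_y, r_xy, sd_x, sd_y):
    """Reliability of the gain D = Y - X under classical test theory.

    rel_x, rel_y: reliabilities of the pretest and posttest scores
    r_xy:         observed correlation between pretest and posttest
    sd_x, sd_y:   observed standard deviations of the two scores
    Assumes measurement errors on X and Y are uncorrelated.
    """
    cov_xy = r_xy * sd_x * sd_y
    # True-score variance of the difference.
    true_var = rel_x * sd_x**2 + rel_y * sd_y**2 - 2 * cov_xy
    # Observed-score variance of the difference.
    obs_var = sd_x**2 + sd_y**2 - 2 * cov_xy
    if obs_var <= 0:
        raise ValueError("observed variance of the difference must be positive")
    return true_var / obs_var


# Same component reliabilities, different pre/post correlations.
high_corr = gain_score_reliability(0.8, 0.8, 0.7, 1.0, 1.0)
low_corr = gain_score_reliability(0.8, 0.8, 0.2, 1.0, 1.0)
print(round(high_corr, 3), round(low_corr, 3))
```

With equal standard deviations and component reliabilities of 0.8, a pre/post correlation of 0.7 gives a gain reliability of about 0.33, while a correlation of 0.2 gives 0.75. Low reliability follows from the components rather than from gain scores as such.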
Peer reviewed
Williams, Richard H.; Zimmerman, Donald W. – Applied Psychological Measurement, 1996
The critiques by L. Collins and L. Humphreys in this issue illustrate problems with the use of gain scores. Collins' examples show that familiar formulas for the reliability of differences do not reflect the precision of measures of change. Additional examples demonstrate flaws in the conventional approach to reliability. (SLD)
Descriptors: Achievement Gains, Change, Correlation, Error of Measurement
Peer reviewed
Luecht, Richard M. – Educational and Psychological Measurement, 1987
Test Pac, a test scoring and analysis computer program for moderate-sized sample designs using dichotomous response items, performs comprehensive item analyses and multiple reliability estimates. It also performs single-facet generalizability analysis of variance, single-parameter item response theory analyses, test score reporting, and computer…
Descriptors: Computer Assisted Testing, Computer Software, Computer Software Reviews, Item Analysis
Hoffman, Anne – 1997
The Ability Explorer (AE) is a newly developed self-report inventory of abilities that is appropriate for group or individual administration. There are machine-scorable and hand-scorable versions of the test, and there are two levels. Level 1 is for students from junior high to high school, and Level 2 is for high school students and adults.…
Descriptors: Ability, Adolescents, Adults, Aptitude Tests
Peer reviewed
Mathewson, Grover C. – Reading Teacher, 1988
Concludes that the instrument reviewed is a carefully designed test incorporating a new interpretation of standardization and improved definitions of traditional reading levels. (FL)
Descriptors: Elementary Education, Reading Ability, Reading Instruction, Reading Tests
Peer reviewed
Oakland, Thomas; And Others – Gifted Child Quarterly, 1996
Eleven leadership measures for children, youth, and adults are reviewed in the context of current leadership theories and psychometric standards for test use. Measures for assessing leadership among children are considered inadequately normed and lacking in reliability and validity data, but leadership measures for adults are seen as more…
Descriptors: Adolescents, Adults, Children, Gifted