Showing 31 to 45 of 52 results
Peer reviewed
Liang, Xin – Evaluation and Research in Education, 2003
Multiple matrix sampling is a data collection technique that ensures accuracy and efficiency in group performance. It has been widely used in large-scale curriculum evaluation since the 1980s. However, the design does not always fully embrace the dynamics of local evaluation demands. The purpose of this study is to introduce a modified matrix…
Descriptors: Curriculum Evaluation, Item Sampling, Matrices, Statistical Studies
Kriewall, Thomas E. – Illinois School Research, 1972
Author discusses and defines criterion tests in the context of classroom needs that have created much of the interest in the theory at this time. The primary source of interest is related to the growing implementation of individualized curricula. (Author/CB)
Descriptors: Criterion Referenced Tests, Difficulty Level, Individualized Instruction, Item Analysis
Kriewall, Thomas E.; Hirsch, Edward – 1969
As an alternative to a classical test theory basis for criterion-referenced test construction, it is proposed that a strict item-sampling model be used. The computer's role in such a model is outlined. The assumptions of the model are carefully defined and its properties reviewed. The relationship between mastery criteria and such sampling plans…
Descriptors: Arithmetic, Behavioral Objectives, Computer Assisted Instruction, Criterion Referenced Tests
Peer reviewed
Lord, Frederic M. – Applied Psychological Measurement, 1977
Under given conditions, conventional testing and computer-generated repeatable testing (CGRT) are equally effective for estimating examinee ability; CGRT is more effective for estimating the mean ability level of a group and less effective for estimating ability differences among individuals. These conclusions are drawn from domain-referenced test…
Descriptors: Career Development, Computer Assisted Testing, Difficulty Level, Group Norms
Gifford, Janice A.; Hambleton, Ronald K. – 1980
Technical considerations associated with item selection and reliability assessment are considered in relation to criterion-referenced tests constructed to provide group information. The purpose is to emphasize test building and the evaluation of test scores in program evaluation studies. It is stressed that an evaluator employ a performance or…
Descriptors: Criterion Referenced Tests, Group Testing, Item Sampling, Models
Mislevy, Robert J.; And Others – 1982
An approach was developed based on item-response models defined at the level of salient subject groups rather than at the level of individuals, designed for use with multiple-matrix sampling designs. In each of three National Assessment of Educational Progress (NAEP) mathematics subtopics, Reiser's group-effects latent trait model was fitted to…
Descriptors: Educational Assessment, Item Analysis, Item Sampling, Latent Trait Theory
Pandey, Tej N. – 1978
The concept under investigation was the reliability of estimates of mean scores of groups under various assumptions of multiple-matrix sampling when reliabilities are computed according to procedures based on generalizability theory. Four different cases were compared with respect to the generalizability coefficients depending upon whether pupils…
Descriptors: Achievement Tests, Analysis of Variance, Basic Skills, Elementary Secondary Education
Harris, Chester W.; And Others – 1977
The implications of a mathematical model of test scores are explored where the data are limited to a random sample of items without replacement from an indefinitely large population or item domain in which items are scored either zero or one. The purpose is to obtain an unbiased estimate of a student's proportion of items correct in the item…
Descriptors: Academic Achievement, Achievement Tests, Annotated Bibliographies, Bibliographies
Carloni, John A.; Kolen, Michael J. – 1980
Generalizability theory was used to analyze the dependability of elementary school student ratings of attitudes toward school subjects. The rating scales under investigation have been developed to measure the attitudes of students toward four school subjects at both the primary and intermediate levels. Two generalizability coefficients, differing…
Descriptors: Attitude Measures, Comparative Analysis, Elementary Education, Elementary School Mathematics
Forster, Fred – 1987
Studies carried out over a 12-year period addressed fundamental questions on the use of Rasch-based item banks. Large field tests administered in grades 3-8 of reading, mathematics, and science items, as well as standardized test results were used to explore the possible effects of many factors on item calibrations. In general, the results…
Descriptors: Achievement Tests, Difficulty Level, Elementary Education, Item Analysis
Kriewall, Thomas E. – 1972
The measurement information generated by CRTs is designed for use in instructional management systems where classifications of pupils for treatment are to be decided on the basis of minimal data consistent with predetermined limits for the errors of misclassification. The measures obtained are content-specific estimates of proficiency useful for…
Descriptors: Ability Grouping, Academic Achievement, Criterion Referenced Tests, Decision Making
Brown, James Dean – 1983
This study attempted to determine the effectiveness of cloze procedures as norm-referenced instruments by comparing the differential responses of four groups of college students of English as a second language on two identical cloze passages. The responses were scored using both exact-answer and acceptable-word methods. The results indicate that…
Descriptors: Cloze Procedure, College Students, Comparative Analysis, English (Second Language)
Cook, Linda L.; And Others – 1987
This study tests several explanations for discrepant results in an earlier study (Cook et al., 1985) which presented a partial pre-calibration method for equating new editions of the Scholastic Aptitude Test (SAT) to the same scale as older editions. In contrast to full pre-calibration, which seeks to equate all items from two or more editions,…
Descriptors: College Entrance Examinations, Concurrent Validity, Equated Scores, Estimation (Mathematics)
Wilcox, Rand R. – 1979
Mastery tests are analyzed in terms of the number of skills to be mastered and the number of items per skill, in order that correct decisions of mastery or nonmastery will be made to a desired degree of probability. It is assumed that a random sample of skills will be selected for measurement, that each skill will be measured by the same number of…
Descriptors: Achievement Tests, Cutting Scores, Decision Making, Equivalency Tests
Gillmore, Gerald M. – 1979
It is argued in this paper that generalizability theory provides a uniquely useful framework for defining and quantifying the dependability of data for decision making. It does so by requiring careful specification of the conditions of measurement and the anticipated sources of variation in the results of the measurement procedure. A distinction…
Descriptors: Analysis of Variance, Criterion Referenced Tests, Decision Making, Educational Assessment