NotesFAQContact Us
Collection
Advanced
Search Tips
Laws, Policies, & Programs
Job Training Partnership Act…1
What Works Clearinghouse Rating
Showing 91 to 105 of 133 results Save | Export
Saunders, Joseph C.; Huynh, Huynh – 1980
In most reliability studies, the precision of a reliability estimate varies inversely with the number of examinees (sample size). Thus, to achieve a given level of accuracy, some minimum sample size is required. An approximation for this minimum size may be made if some reasonable assumptions regarding the mean and standard deviation of the test…
Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests
Rodriguez-Aragon, Graciela; And Others – 1993
The predictive power of the Split-Half version of the Wechsler Intelligence Scale for Children--Revised (WISC-R) Object Assembly (OA) subtest was compared to that of the full administration of the OA subtest. A cohort of 218 male and 49 female adolescent offenders detained in a Texas juvenile detention facility between 1990 and 1992 was used. The…
Descriptors: Adolescents, Cohort Analysis, Comparative Testing, Correlation
Peer reviewed Peer reviewed
Harrington, Robert G.; Jennings, Valerie – Contemporary Educational Psychology, 1986
Three short forms of the McCarthy Scales of Children's Abilities (MSCA) have been developed to screen the cognitive skills of young children suspected of learning disorders and developmental delays. Correlations were obtained between scores on the full form of the MSCA and the Kaufman, Taylor, and McCarthy Screening Test short forms. (Author/LMO)
Descriptors: Cognitive Tests, Comparative Testing, Correlation, Early Childhood Education
Peer reviewed Peer reviewed
Direct linkDirect link
Wang, Wen-Chung; Chen, Hsueh-Chu – Educational and Psychological Measurement, 2004
As item response theory (IRT) becomes popular in educational and psychological testing, there is a need of reporting IRT-based effect size measures. In this study, we show how the standardized mean difference can be generalized into such a measure. A disattenuation procedure based on the IRT test reliability is proposed to correct the attenuation…
Descriptors: Test Reliability, Rating Scales, Sample Size, Error of Measurement
Mills, Craig N.; Simon, Robert – 1981
When criterion-referenced tests are used to assign examinees to states reflecting their performance level on a test, the better known methods for determining test length, which consider relationships among domain scores and errors of measurement, have their limitations. The purpose of this paper is to present a computer system named TESTLEN, which…
Descriptors: Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores, Error of Measurement
Schedl, Mary; And Others – 1995
The Test of English as a Foreign Language (TOEFL) program is exploring a change in Section 3 of the TOEFL test that would replace the vocabulary subpart with additional reading comprehension questions. This study investigated the proposed revision in terms of the length and timing that would be necessary to address concerns of test speededness of…
Descriptors: Adult Students, English (Second Language), Language Tests, Psychometrics
Reckase, Mark D. – 1981
This report describes a study comparing the classification results obtained from a one-parameter and three-parameter logistic based tailored testing procedure used in conjunction with Wald's sequential probability ratio test (SPRT). Eighty-eight college students were classified into four grade categories using achievement test results obtained…
Descriptors: Adaptive Testing, Classification, Comparative Analysis, Computer Assisted Testing
Byars, Alvin Gregg – 1980
The objectives of this investigation are to develop, describe, assess, and demonstrate procedures for constructing mastery tests to minimize errors of classification and to maximize decision reliability. The guidelines are based on conditions where item exchangeability is a reasonable assumption and the test constructor can control the number of…
Descriptors: Cutting Scores, Difficulty Level, Grade 4, Intermediate Grades
Catts, Ralph – 1978
The reliability of multiple choice tests--containing different numbers of response options--was investigated for 260 students enrolled in technical college economics courses. Four test forms, constructed from previously used four-option items, were administered, consisting of (1) 60 two-option items--two distractors randomly discarded; (2) 40…
Descriptors: Answer Sheets, Difficulty Level, Foreign Countries, Higher Education
Peer reviewed Peer reviewed
Direct linkDirect link
Wise, Steven L. – Applied Measurement in Education, 2006
In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…
Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory
Mitchell, Karen J.; Anderson, Judith A. – 1987
The Association of American Medical Colleges is conducting research to develop, implement, and evaluate a Medical College Admission Test (MCAT) essay testing program. Essay administration in the spring and fall of 1985 and 1986 suggested that additional research was needed on the development of topics which elicit similar skills and meet standard…
Descriptors: College Entrance Examinations, Essay Tests, Estimation (Mathematics), Generalizability Theory
Metropolitan Atlanta Consortium of Consultants and Lead Speech-Language Pathologists, GA. – 1990
This guide presents ratings of assessment instruments for use by speech-language pathologists with preschool students. Tests are reviewed in alphabetical order on forms filled out by practicing speech-language pathologists, including data on speech components covered by each test, age range, factors of norms where norms are used, reliability,…
Descriptors: Diagnostic Tests, Examiners, Preschool Education, Preschool Tests
de Jong, John H. A. L. – 1984
The Netherlands' secondary education system is highly differentiated, with four different school types for four scholastic ability levels. Final examinations must accommodate these four levels, and require a test-independent definition of the intended final ability levels as well as a sample-free evaluation of the range of ability levels at which…
Descriptors: Difficulty Level, Efficiency, Equated Scores, Foreign Countries
Oosterhof, Albert C.; Coats, Pamela K. – 1981
Instructors who develop classroom examinations that require students to provide a numerical response to a mathematical problem are often very concerned about the appropriateness of the multiple-choice format. The present study augments previous research relevant to this concern by comparing the difficulty and reliability of multiple-choice and…
Descriptors: Comparative Analysis, Difficulty Level, Grading, Higher Education
Eignor, Daniel R.; Hambleton, Ronald K. – 1979
The purpose of the investigation was to obtain some relationships among (1) test lengths, (2) shape of domain-score distributions, (3) advancement scores, and (4) several criterion-referenced test score reliability and validity indices. The study was conducted using computer simulation methods. The values of variables under study were set to be…
Descriptors: Comparative Analysis, Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9