ERIC Number: ED442861
Record Type: Non-Journal
Publication Date: 2000-Apr-25
Pages: 17
Abstractor: N/A
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: N/A
Establishing the Reliability of Student Proficiency Classifications: The Accuracy of Observed Classifications.
Hoffman, R. Gene; Wise, Lauress L.
Classical test theory is based on the concept of a true score for each examinee, defined as the expected or average score across an infinite number of repeated parallel tests. In most cases, there is only a score from a single administration of the test in question. The difference between this single observed score and the underlying true score is error. This paper focuses on accuracy as a function of particular observed scores, questioning whether a student's unknown true score is likely to be in the same category as the student's observed score. A limited set of test items was retrieved from a state-wide examination. Sixteen multiple-choice mathematics items for 3,000 students were scaled using the three-parameter logistic option. The primary conclusion from this study is that classification accuracy functions based on observed scores look quite different from accuracy functions based on true scores. For some of the observed scores, the most likely true score is an adjacent classification category. A further exploration considered how observed scores are placed on the true score scale and whether using the same cut-points for true and observed scores is the best approach. The overall conclusion is that there is no way, short of a perfectly reliable test, of simultaneously maximizing observed score classification accuracy and the accuracy with which overall population distributions are estimated. Nonetheless, observed score classification accuracy curves do provide information about individual observed scores that is quite useful. These curves also provide a way of illustrating the consequences of particular decisions about the scaling and equating of performance category subscores. An appendix contains a visual depiction of the probability computations from the study. (SLD)
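The computation the abstract describes, the probability that a student's true score falls in the same category as the observed score, can be illustrated with a minimal sketch. This is not the authors' procedure. It assumes a three-parameter logistic model without the 1.7 scaling constant, a standard normal ability distribution on a coarse grid, number-correct scoring, and the same cut-points applied to true and observed scores. The item parameters in `ITEMS` and the cut scores in `CUTS` are invented for illustration and are not taken from the study.

```python
import math

def p_3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def score_distribution(probs):
    """Distribution of the number-correct score for independent items
    (Lord-Wingersky recursion)."""
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for x, mass in enumerate(dist):
            new[x] += mass * (1.0 - p)      # item answered incorrectly
            new[x + 1] += mass * p          # item answered correctly
        dist = new
    return dist

def category(score, cuts):
    """Index of the category a score falls in, given ascending cut scores."""
    return sum(score >= cut for cut in cuts)

def true_category_given_observed(items, cuts, grid_points=81, lo=-4.0, hi=4.0):
    """Return rows[x][k] = P(true category k | observed number-correct x)."""
    step = (hi - lo) / (grid_points - 1)
    thetas = [lo + k * step for k in range(grid_points)]
    # Assumed standard normal ability distribution, normalized over the grid.
    weights = [math.exp(-0.5 * t * t) for t in thetas]
    total = sum(weights)
    weights = [w / total for w in weights]

    n_cat = len(cuts) + 1
    joint = [[0.0] * n_cat for _ in range(len(items) + 1)]
    for t, w in zip(thetas, weights):
        probs = [p_3pl(t, a, b, c) for a, b, c in items]
        # True score = expected number-correct score at this ability.
        true_cat = category(sum(probs), cuts)
        for x, px in enumerate(score_distribution(probs)):
            joint[x][true_cat] += w * px

    # Normalize each observed-score row into a conditional distribution.
    return [[v / sum(row) for v in row] for row in joint]

# Hypothetical 16-item test: (a, b, c) per item, difficulties from -1.5 to 1.5.
ITEMS = [(1.0, -1.5 + 0.2 * i, 0.2) for i in range(16)]
# Hypothetical cut scores on the number-correct scale (four categories).
CUTS = [6, 10, 13]

posterior = true_category_given_observed(ITEMS, CUTS)
# Accuracy for each observed score: probability that the true score
# lies in the same category as the observed score.
accuracy = [posterior[x][category(x, CUTS)] for x in range(len(ITEMS) + 1)]
```

Plotting `accuracy` against the observed score gives an observed score classification accuracy curve of the kind the abstract refers to. Under these assumptions, comparing `posterior[x]` across categories shows whether the most likely true category for a given observed score is its own category or an adjacent one.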
Publication Type: Numerical/Quantitative Data; Reports - Research; Speeches/Meeting Papers
Education Level: N/A
Audience: N/A
Language: English
Sponsor: Kentucky State Dept. of Education, Frankfort.
Authoring Institution: Human Resources Research Organization, Alexandria, VA.
Grant or Contract Numbers: N/A
Author Affiliations: N/A