NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 4 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Davis-Becker, Susan L.; Buckendahl, Chad W. – International Journal of Testing, 2013
A critical component of the standard setting process is collecting evidence to evaluate the recommended cut scores and their use for making decisions and classifying students based on test performance. Kane (1994, 2001) proposed a framework by which practitioners can identify and evaluate evidence of the results of the standard setting from (1)…
Descriptors: Standard Setting (Scoring), Evidence, Validity, Cutting Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Lim, Gad S.; Geranpayeh, Ardeshir; Khalifa, Hanan; Buckendahl, Chad W. – International Journal of Testing, 2013
Standard setting theory has largely developed with reference to a typical situation, determining a level or levels of performance for one exam for one context. However, standard setting is now being used with international reference frameworks, where some parameters and assumptions of classical standard setting do not hold. We consider the…
Descriptors: Standard Setting (Scoring), Validity, Models, Language Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Clauser, Brian E.; Mee, Janet; Margolis, Melissa J. – International Journal of Testing, 2013
This study investigated the extent to which the performance data format impacted data use in Angoff standard setting exercises. Judges from two standard settings (a total of five panels) were randomly assigned to one of two groups. The full-data group received two types of data: (1) the proportion of examinees selecting each option and (2) plots…
Descriptors: Standard Setting (Scoring), Cutting Scores, Validity, Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Stone, Gregory Ethan; Beltyukova, Svetlana; Fox, Christine M. – International Journal of Testing, 2008
Judge-mediated examinations are defined as those for which expert evaluation (using rubrics) is required to determine correctness, completeness, and reasonability of test-taker responses. The use of multifaceted Rasch modeling has led to improvements in the reliability of scoring such examinations. The establishment of criterion-referenced…
Descriptors: Interrater Reliability, High Stakes Tests, Standard Setting, Minimum Competencies