Showing all 13 results
Peer reviewed
Thomas, Christopher L.; Cassady, Jerrell C.; Finch, W. Holmes – Journal of Psychoeducational Assessment, 2018
The purpose of the current examination was to preliminarily suggest severity standards for the recently revised Cognitive Test Anxiety Scale-Second Edition (CTAS-2). Participants responded to the CTAS-2, Motivated Strategies for Learning Questionnaire (MSLQ), and FRIEDBEN Test Anxiety Scale. Using both latent class and cluster analyses, we were…
Descriptors: Cognitive Tests, Test Anxiety, Cutting Scores, Multivariate Analysis
Kaliski, Pamela; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna; Plake, Barbara; Reshetar, Rosemary – College Board, 2012
The Many-Facet Rasch (MFR) Model is traditionally used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR Model by examining the quality of ratings obtained from a…
Descriptors: Advanced Placement Programs, Achievement Tests, Item Response Theory, Models
Peer reviewed
Scherman, Vanessa; Howie, Sarah J.; Bosker, Roel J. – Educational Research and Evaluation, 2011
In information-rich environments, schools are often presented with a myriad of data from which decisions need to be made. The use of the information on a classroom level may be facilitated if performance could be described in terms of levels of proficiency or benchmarks. The aim of this article is to explore benchmarks using data from a monitoring…
Descriptors: Standard Setting, Foreign Countries, Grade 8, Ability
Reid, Jerry B. – 1985
This report investigates an area of uncertainty in using the Angoff method for setting standards, namely whether or not a judge's conceptualizations of borderline group performance are realistic. Ratings are usually made with reference to the performance of this hypothetical group, therefore the Angoff method's success is dependent on this point.…
Descriptors: Certification, Cutting Scores, Difficulty Level, Interrater Reliability
PDF pending restoration
Faggen, Jane; And Others – 1995
The objective of this study was to determine the degree to which recommendations for passing scores, calculated on the basis of a traditional standard-setting methodology, might be affected by the mode (paper versus computer-screen prints) in which test items were presented to standard setting panelists. Results were based on the judgments of 31…
Descriptors: Computer Assisted Testing, Cutting Scores, Difficulty Level, Evaluators
Peer reviewed
Norcini, John; And Others – Applied Measurement in Education, 1994
Whether anchor item sets varying in difficulty and discrimination affect precision of cutting score equivalents generated through judge rescaling as much as equivalents from score equating was studied with 4 groups of experts and 250 and 1,000 examinees. Results indicate the robustness of judge rescaling and its superiority over equating. (SLD)
Descriptors: Cutting Scores, Decision Making, Difficulty Level, Equated Scores
Peer reviewed
Chang, Lei; And Others – Applied Measurement in Education, 1996
The influence of judges' knowledge on standard setting for competency tests was studied with 17 judges who took an economics teacher certification test while setting competency standards using the Angoff procedure. Judges tended to set higher standards for items they answered correctly and lower standards for items they answered incorrectly. (SLD)
Descriptors: Competence, Difficulty Level, Economics, Judges
Johanson, George A.; Rich, Charles E. – 1991
Assigning letter grades in a consistent manner to tests in large classes across semesters is problematic if absolute grading standards are used. It may be unreasonable to implement the usual standard-setting approaches recommended for large-scale criterion-referenced testing due to both time constraints and a desire to have criteria that appear…
Descriptors: Class Size, College Students, Criterion Referenced Tests, Difficulty Level
Peer reviewed
Grosse, Martin E.; Wright, Benjamin D. – Evaluation and the Health Professions, 1986
Based on the standard setting procedures of the American Board of Preventive Medicine for their Core Test, this article describes how Rasch measurement can facilitate using test content judgments in setting a standard. Rasch measurement can then be used to evaluate and improve the precision of the standard and to hold it constant across time.…
Descriptors: Certification, Criterion Referenced Tests, Difficulty Level, Health Personnel
Chang, Lei – 1996
It was hypothesized that, when compared to the Angoff method (W. H. Angoff, 1971), the Nedelsky method (L. Nedelsky, 1954) for standard setting had lower intrajudge inconsistency, lower cutscores, and lower cutscores especially for items presenting challenges to the judges. These hypotheses were tested and supported in a sample of 22 graduate…
Descriptors: Comparative Analysis, Cutting Scores, Difficulty Level, Distractors (Tests)
DeMauro, Gerald E. – 1995
Studies of the Angoff method of standard setting suggest that judges agree in their estimates of the relative difficulties of test questions for minimally competent examinees and that each judge's estimates correlate well with the observed item difficulties for examinees whose total test scores are near the judge's personal standard (G. E.…
Descriptors: Ability, Competence, Construct Validity, Difficulty Level
Peer reviewed
Plake, Barbara S.; Melican, Gerald J. – Educational and Psychological Measurement, 1989
The impact of overall test length and difficulty on expert judgments of item performance made with the Nedelsky method was studied. Five university-level instructors predicting the performance of minimally competent candidates on a mathematics examination were fairly consistent in their assessments regardless of length or difficulty of the test.…
Descriptors: Difficulty Level, Estimation (Mathematics), Evaluators, Higher Education
Ziomek, Robert L.; Wright, Benjamin D. – 1984
Techniques such as the norm-referenced and average score techniques, commonly used in the identification of educationally disadvantaged students, are critiqued. This study applied latent trait theory, specifically the Rasch Model, along with teacher judgments relative to the mastery of instructional/test decisions, to derive a standard setting…
Descriptors: Cutting Scores, Difficulty Level, Educationally Disadvantaged, Intermediate Grades