Publication Date
| Period | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 1 |
| Since 2007 (last 20 years) | 8 |
Source
| Source | Count |
| --- | --- |
| Educational and Psychological… | 3 |
| Applied Measurement in… | 2 |
| Applied Psychological… | 1 |
| Educational Sciences: Theory… | 1 |
| Higher Education Studies | 1 |
| Journal of Educational… | 1 |
| Practical Assessment,… | 1 |
Author
| Author | Count |
| --- | --- |
| Ferdous, Abdullah A. | 2 |
| Plake, Barbara S. | 2 |
| Wyse, Adam E. | 2 |
| van der Linden, Wim J. | 2 |
| Babcock, Ben | 1 |
| Beretvas, S. Natasha | 1 |
| Chang, Lei | 1 |
| Chis, Liliana | 1 |
| Clauser, Brian E. | 1 |
| Foley, Brett P. | 1 |
| Gelbal, Selahattin | 1 |
Publication Type
| Type | Count |
| --- | --- |
| Journal Articles | 10 |
| Reports - Research | 9 |
| Reports - Evaluative | 2 |
| Reports - Descriptive | 1 |
| Speeches/Meeting Papers | 1 |
Education Level
| Level | Count |
| --- | --- |
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Elementary Education | 1 |
| Grade 11 | 1 |
| Grade 5 | 1 |
| Grade 8 | 1 |
Audience
| Audience | Count |
| --- | --- |
| Practitioners | 1 |
Location
| Location | Count |
| --- | --- |
| United Kingdom | 1 |
Laws, Policies, & Programs
| Law/Program | Count |
| --- | --- |
| No Child Left Behind Act 2001 | 1 |
Wyse, Adam E.; Babcock, Ben – Journal of Educational Measurement, 2019
One common phenomenon in Angoff standard setting is that panelists regress their ratings in toward the middle of the probability scale. This study describes two indices based on taking ratios of standard deviations that can be utilized with a scatterplot of item ratings versus expected probabilities of success to identify whether ratings are…
Descriptors: Item Analysis, Standard Setting, Probability, Feedback (Response)
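The ratio-of-standard-deviations idea described in this abstract can be sketched as follows. This is a hypothetical illustration only; the paper's exact indices, notation, and scatterplot diagnostics are not reproduced here:

```python
from statistics import pstdev

def spread_ratio(ratings, expected):
    """Ratio of the spread of panelists' Angoff item ratings to the spread
    of the expected probabilities of success (hypothetical index; the
    paper's own indices may be defined differently). Values well below 1
    are consistent with ratings regressing in toward the middle of the
    probability scale."""
    return pstdev(ratings) / pstdev(expected)

# Ratings bunched near 0.55 while expected probabilities span 0.2-0.8
# would yield a ratio far below 1:
ratio = spread_ratio([0.50, 0.55, 0.60], [0.20, 0.50, 0.80])
```

Paired with a scatterplot of ratings against expected probabilities, a ratio near 1 suggests panelists used the full probability scale, while a small ratio flags compressed ratings.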
Foley, Brett P. – Practical Assessment, Research & Evaluation, 2016
There is always a chance that examinees will answer multiple choice (MC) items correctly by guessing. Design choices in some modern exams have created situations where guessing at random through the full exam--rather than only for a subset of items where the examinee does not know the answer--can be an effective strategy to pass the exam. This…
Descriptors: Guessing (Tests), Multiple Choice Tests, Case Studies, Test Construction
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
Shulruf, Boaz; Jones, Phil; Turner, Rolf – Higher Education Studies, 2015
The determination of Pass/Fail decisions over Borderline grades (i.e., grades that do not clearly distinguish between competent and incompetent examinees) has been an ongoing challenge for academic institutions. This study utilises the Objective Borderline Method (OBM) to determine examinee ability and item difficulty, and from that…
Descriptors: Undergraduate Students, Pass Fail Grading, Decision Making, Probability
Çetin, Sevda; Gelbal, Selahattin – Educational Sciences: Theory and Practice, 2013
In this research, the cut score of a foundation university was re-calculated with the Bookmark method and with the Angoff method, each of which is a standard-setting method, and the cut scores found were compared with the current proficiency score. Thus, the final cut score was found to be 27.87 with the cooperative work of 17 experts through the Angoff…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Cutting Scores, Correlation
Wyse, Adam E. – Educational and Psychological Measurement, 2011
Standard setting is a method used to set cut scores on large-scale assessments. One of the most popular standard setting methods is the Bookmark method. In the Bookmark method, panelists are asked to envision a response probability (RP) criterion and move through a booklet of ordered items based on that criterion. This study investigates whether…
Descriptors: Testing Programs, Standard Setting (Scoring), Cutting Scores, Probability
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009
Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…
Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2005
In an Angoff standard-setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item constituting the test. In many cases, these item performance estimates are made twice, with information shared with the judges between estimates. Especially for long tests,…
Descriptors: Test Items, Probability, Standard Setting (Scoring)
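The Angoff procedure described in these abstracts conventionally aggregates the judges' item-level probability estimates into a cut score as the sum, across items, of the mean rating per item (equivalently, the mean of each judge's summed ratings). A minimal sketch, with illustrative variable names:

```python
def angoff_cut_score(ratings):
    # ratings[j][i]: judge j's estimated probability that a hypothetical
    # minimally competent candidate answers item i correctly.
    n_judges = len(ratings)
    n_items = len(ratings[0])
    # Mean rating per item across judges.
    item_means = [
        sum(ratings[j][i] for j in range(n_judges)) / n_judges
        for i in range(n_items)
    ]
    # Panel-recommended cut score on the raw-score scale.
    return sum(item_means)
```

For example, two judges rating a two-item test as [[0.5, 0.8], [0.7, 0.6]] give item means of 0.6 and 0.7, hence a cut score of 1.3 raw-score points.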
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2007
In an Angoff standard setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item in the test. In many cases, these item performance estimates are made twice, with information shared with the panelists between estimates. Especially for long tests, this…
Descriptors: Test Items, Probability, Item Analysis, Standard Setting (Scoring)
van der Linden, Wim J.; Vos, Hans J.; Chang, Lei – 2000
In judgmental standard setting experiments, it may be difficult to specify subjective probabilities that adequately take the properties of the items into account. As a result, these probabilities are not consistent with each other in the sense that they do not refer to the same borderline level of performance. Methods to check standard setting…
Descriptors: Interrater Reliability, Judges, Probability, Standard Setting
Beretvas, S. Natasha – Applied Psychological Measurement, 2004
In the bookmark standard-setting procedure, judges place "bookmarks" in a reordered test booklet containing items presented in order of increasing difficulty. Traditionally, the bookmark difficulty location (BDL) is on the trait continuum where, for dichotomous items, there is a two-thirds probability of a correct response and, for a score of "k"…
Descriptors: Probability, Standard Setting, Item Response Theory, Test Items
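For a dichotomous item, the bookmark difficulty location at a given response-probability criterion has a closed form under an IRT model. A sketch assuming the standard 2PL model with no guessing parameter (the paper's own derivations for polytomous items are not reproduced here):

```python
import math

def bdl_theta(a, b, rp=2/3):
    # Trait level at which a 2PL item with discrimination `a` and
    # difficulty `b`, P(theta) = 1 / (1 + exp(-a * (theta - b))),
    # reaches response probability `rp`. The traditional bookmark
    # criterion uses rp = 2/3 for dichotomous items.
    return b + math.log(rp / (1 - rp)) / a

# With a = 1 and b = 0, the RP67 location sits ln(2) above the
# item difficulty b.
```

Note that at rp = 0.5 the location reduces to the item difficulty b itself, which is why the choice of RP criterion shifts every bookmark location by a constant on the logit scale (scaled by 1/a).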
Rock, D. A.; And Others – 1980
An experiment was designed that varied cutting score procedures, instructions, and types of judges in order to address the following questions concerning the Real Estate Licensing Examination: (1) Will the cutting score levels produced by groups of judges from differing backgrounds (academicians vs. practitioners vs. lawyers) using the same method…
Descriptors: Competence, Content Analysis, Criterion Referenced Tests, Cutting Scores
van der Linden, Wim J. – 1982
A latent trait method is presented to investigate the possibility that Angoff or Nedelsky judges specify inconsistent probabilities in standard setting techniques for objectives-based instructional programs. It is suggested that judges frequently specify a low probability of success for an easy item but a large probability for a hard item. The…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Interrater Reliability
