ERIC - Search Results

Showing 1 to 15 of 42 results

Embedding Embedded Standard Setting: An Application of Cross-Classified Item Response Theory. CRESST Report 876

Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025

This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…

Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level

It's Not Just Angoff: Misperceptions of Hard and Easy Items in Bookmark-Type Ratings

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020

A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…

Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items

Comparison of Passing Scores Determined by the Angoff Method in Different Item Samples

Peer reviewed

Kara, Hakan; Cetin, Sevda – International Journal of Assessment Tools in Education, 2020

In this study, the efficiency of various random sampling methods for reducing the number of items rated by judges in an Angoff standard-setting study was examined, and the methods were compared with each other. First, the full-length test was formed by combining the Placement Test 2012 and 2013 mathematics subsets. Then, simple random sampling…

Descriptors: Cutting Scores, Standard Setting (Scoring), Sampling, Error of Measurement

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

Regression Effects in Angoff Ratings: Examples from Credentialing Exams

Peer reviewed

Wyse, Adam E. – Applied Measurement in Education, 2018

This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…

Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)

Development of Objective Standard Setting Using Rasch Measurement Model in Malaysian Institution of Higher Learning

Peer reviewed

Khatimin, Nuraini; Aziz, Azrilah Abdul; Zaharim, Azami; Yasin, Siti Hanani Mat – International Education Studies, 2013

Measurement and evaluation of students' achievement are important for making sure that students really understand the course content and for monitoring their achievement level. Performance is reflected not only in the number of high achievers among the students, but also in the quality of the grade obtained; does the grade "A" truly…

Descriptors: Standard Setting, Item Response Theory, Measurement Objectives, Measurement Techniques

Using Student Ability and Item Difficulty for Making Defensible Pass/Fail Decisions for Borderline Grades

Peer reviewed

Shulruf, Boaz; Jones, Phil; Turner, Rolf – Higher Education Studies, 2015

The determination of Pass/Fail decisions over Borderline grades (i.e., grades which do not clearly distinguish between the competent and incompetent examinees) has been an ongoing challenge for academic institutions. This study utilises the Objective Borderline Method (OBM) to determine examinee ability and item difficulty, and from that…

Descriptors: Undergraduate Students, Pass Fail Grading, Decision Making, Probability

The Impact of Social Comparison on the Judgment-Based Angoff Method

Sorensen, Henry L. – ProQuest LLC, 2013

Cut-score setting processes are used to establish the passing standards for all kinds of tests in education and for credentialing. While experts use their best efforts to guide cut-score setting processes to generate valid and reliable results, cut-score participants often have a difficult time understanding the standard at which the cut score is…

Descriptors: Cutting Scores, Standard Setting (Scoring), Comparative Analysis, Difficulty Level

Assessing the Viability of External Searchable Resources on the American Board of Family Medicine's Certification Examination

O'Neill, Thomas R.; Peabody, Michael R.; Stelter, Keith L.; Hagen, Michael D. – Online Submission, 2015

(Purpose) The purpose of our study was to assess the need for an external searchable resource (ESR) to be used in conjunction with the American Board of Family Medicine's (ABFM) Maintenance of Certification for Family Physicians (MC-FP) Examination, discuss the philosophical question of whether an ESR should be allowed on the examination, and outline…

Descriptors: Licensing Examinations (Professions), Family Practice (Medicine), Physicians, Online Searching

A Comparison of Bookmark and Angoff Standard Setting Methods

Peer reviewed

Çetin, Sevda; Gelbal, Selahattin – Educational Sciences: Theory and Practice, 2013

In this research, the cut score of a foundation university was recalculated with the Bookmark method and with the Angoff method, both of which are standard-setting methods, and the resulting cut scores were compared with the current proficiency score. The final cut score was found to be 27.87 with the cooperative work of 17 experts through the Angoff…

Descriptors: Standard Setting (Scoring), Comparative Analysis, Cutting Scores, Correlation

Setting and Maintaining GCSE and GCE Grading Standards: The Case for Contextualised Cohort-Referencing

Peer reviewed

Stringer, Neil Simon – Research Papers in Education, 2012

General Certificate of Secondary Education (GCSE) and General Certificate of Education (GCE) grading standards are determined by Awarding Bodies using procedures that adhere to the Code of Practice published by the regulator, Ofqual. Grade boundary marks (cut scores) are set using subject experts' (senior examiners) judgement of the quality of…

Descriptors: Foreign Countries, Secondary Education, Exit Examinations, Grading

Applying Rasch Model and Generalizability Theory to Study Modified-Angoff Cut Scores

Peer reviewed

Arce, Alvaro J.; Wang, Ze – International Journal of Testing, 2012

The traditional approach to scale modified-Angoff cut scores transfers the raw cuts to an existing raw-to-scale score conversion table. Under the traditional approach, cut scores and conversion table raw scores are not only seen as interchangeable but also as originating from a common scaling process. In this article, we propose an alternative…

Descriptors: Generalizability Theory, Item Response Theory, Cutting Scores, Scaling

Evaluating the Bookmark Standard Setting Method: The Impact of Random Item Ordering

Peer reviewed

Davis-Becker, Susan L.; Buckendahl, Chad W.; Gerrow, Jack – International Journal of Testing, 2011

Throughout the world, cut scores are an important aspect of a high-stakes testing program because they are a key operational component of the interpretation of test scores. One method for setting standards that is prevalent in educational testing programs--the Bookmark method--is intended to be a less cognitively complex alternative to methods…

Descriptors: Standard Setting (Scoring), Cutting Scores, Educational Testing, Licensing Examinations (Professions)

Setting Standards for English Foreign Language Assessment: Methodology, Validation, and a Degree of Arbitrariness

Peer reviewed

Tiffin-Richards, Simon P.; Pant, Hans Anand; Koller, Olaf – Educational Measurement: Issues and Practice, 2013

Cut-scores were set by expert judges on assessments of reading and listening comprehension of English as a foreign language (EFL), using the bookmark standard-setting method to differentiate proficiency levels defined by the Common European Framework of Reference (CEFR). Assessments contained stratified item samples drawn from extensive item…

Descriptors: Foreign Countries, English (Second Language), Language Tests, Standard Setting (Scoring)

Using the Many-Faceted Rasch Model to Evaluate Standard Setting Judgments: An Illustration with the Advanced Placement Environmental Science Exam

Peer reviewed

Kaliski, Pamela K.; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna L.; Plake, Barbara S.; Reshetar, Rosemary A. – Educational and Psychological Measurement, 2013

The many-faceted Rasch (MFR) model has been used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR model for examining the quality of ratings obtained from a standard…

Descriptors: Item Response Theory, Models, Standard Setting (Scoring), Science Tests
