Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 5 |
| Since 2007 (last 20 years) | 20 |
Descriptor
| Difficulty Level | 42 |
| Standard Setting (Scoring) | 42 |
| Test Items | 33 |
| Cutting Scores | 27 |
| Interrater Reliability | 11 |
| Standards | 10 |
| Item Response Theory | 9 |
| Licensing Examinations… | 9 |
| Minimum Competency Testing | 9 |
| Evaluators | 7 |
| Higher Education | 7 |
Author
| Plake, Barbara S. | 3 |
| Wyse, Adam E. | 3 |
| Chang, Lei | 2 |
| Melican, Gerald J. | 2 |
| Reid, Jerry B. | 2 |
| Wright, Benjamin D. | 2 |
| Arce, Alvaro J. | 1 |
| Aziz, Azrilah Abdul | 1 |
| Babcock, Ben | 1 |
| Beguin, Anton | 1 |
| Bramley, Tom | 1 |
Publication Type
| Reports - Research | 32 |
| Journal Articles | 25 |
| Speeches/Meeting Papers | 11 |
| Reports - Evaluative | 9 |
| Dissertations/Theses -… | 1 |
Education Level
| Higher Education | 5 |
| Postsecondary Education | 5 |
| Secondary Education | 5 |
| Grade 3 | 2 |
| Grade 5 | 2 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| Grade 11 | 1 |
| Grade 7 | 1 |
| High Schools | 1 |
| Junior High Schools | 1 |
Audience
| Researchers | 3 |
Location
| United Kingdom | 4 |
| New Jersey | 2 |
| California | 1 |
| Germany | 1 |
| Malaysia | 1 |
Laws, Policies, & Programs
| Education Consolidation… | 1 |
Assessments and Surveys
| Advanced Placement… | 1 |
Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025
This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…
Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
Kara, Hakan; Cetin, Sevda – International Journal of Assessment Tools in Education, 2020
This study examined and compared the efficiency of various random sampling methods for reducing the number of items rated by judges in an Angoff standard-setting study. First, the full-length test was formed by combining the Placement Test 2012 and 2013 mathematics subsets. Then, simple random sampling…
Descriptors: Cutting Scores, Standard Setting (Scoring), Sampling, Error of Measurement
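As a rough illustration of the idea in the entry above (not the authors' procedure), the sketch below computes an Angoff-style cut score as the sum of mean judge ratings per item, then estimates the same quantity from a simple random sample of items scaled back to full test length. The function names, the ratings matrix, and the scaling step are assumptions made for illustration only.

```python
import random
from statistics import mean


def angoff_cut_score(ratings):
    """Sum of per-item mean ratings.

    ratings[j][i] is judge j's estimate of the probability that a
    minimally competent examinee answers item i correctly.
    """
    n_items = len(ratings[0])
    item_means = [mean(judge[i] for judge in ratings) for i in range(n_items)]
    return sum(item_means)


def sampled_cut_score(ratings, k, seed=0):
    """Estimate the full-test cut score from k randomly sampled items.

    Hypothetical helper: draws a simple random sample of item indices,
    computes the cut score on that subset, and rescales to full length.
    """
    n_items = len(ratings[0])
    rng = random.Random(seed)
    idx = rng.sample(range(n_items), k)
    subset = [[judge[i] for i in idx] for judge in ratings]
    return angoff_cut_score(subset) * n_items / k


# Made-up ratings: 2 judges, 4 items.
ratings = [
    [0.6, 0.4, 0.8, 0.5],
    [0.7, 0.5, 0.9, 0.5],
]

full = angoff_cut_score(ratings)
estimate = sampled_cut_score(ratings, k=2)
print(f"full-test cut score: {full:.2f}")
print(f"estimate from 2 sampled items: {estimate:.2f}")
```

The gap between the sampled estimate and the full-test value is the kind of error that a comparison of sampling methods would examine.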
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
Wyse, Adam E. – Applied Measurement in Education, 2018
This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…
Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)
Khatimin, Nuraini; Aziz, Azrilah Abdul; Zaharim, Azami; Yasin, Siti Hanani Mat – International Education Studies, 2013
Measuring and evaluating students' achievement is important for ensuring that students really understand the course content and for monitoring their achievement level. Performance is reflected not only in the number of high achievers but also in the quality of the grades obtained: does the grade "A" truly…
Descriptors: Standard Setting, Item Response Theory, Measurement Objectives, Measurement Techniques
Shulruf, Boaz; Jones, Phil; Turner, Rolf – Higher Education Studies, 2015
The determination of Pass/Fail decisions over Borderline grades (i.e., grades which do not clearly distinguish between the competent and incompetent examinees) has been an ongoing challenge for academic institutions. This study utilises the Objective Borderline Method (OBM) to determine examinee ability and item difficulty, and from that…
Descriptors: Undergraduate Students, Pass Fail Grading, Decision Making, Probability
Sorensen, Henry L. – ProQuest LLC, 2013
Cut-score setting processes are used to establish the passing standards for all kinds of tests in education and for credentialing. While experts use their best efforts to guide cut-score setting processes to generate valid and reliable results, cut-score participants often have a difficult time understanding the standard at which the cut score is…
Descriptors: Cutting Scores, Standard Setting (Scoring), Comparative Analysis, Difficulty Level
O'Neill, Thomas R.; Peabody, Michael R.; Stelter, Keith L.; Hagen, Michael D. – Online Submission, 2015
(Purpose) The purpose of our study was to assess the need for an external searchable resource to be used in conjunction with the American Board of Family Medicine's (ABFM) Maintenance of Certification for Family Physicians (MC-FP) Examination, discuss the philosophical question of whether an ESR should be allowed on the examination, and outline…
Descriptors: Licensing Examinations (Professions), Family Practice (Medicine), Physicians, Online Searching
Çetin, Sevda; Gelbal, Selahattin – Educational Sciences: Theory and Practice, 2013
In this research, the cut score of a foundation university was recalculated with the Bookmark method and the Angoff method, both of which are standard-setting methods, and the resulting cut scores were compared with the current proficiency score. Thus, the final cut score was found to be 27.87 with the cooperative work of 17 experts through the Angoff…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Cutting Scores, Correlation
Stringer, Neil Simon – Research Papers in Education, 2012
General Certificate of Secondary Education (GCSE) and General Certificate of Education (GCE) grading standards are determined by Awarding Bodies using procedures that adhere to the Code of Practice published by the regulator, Ofqual. Grade boundary marks (cut scores) are set using subject experts' (senior examiners) judgement of the quality of…
Descriptors: Foreign Countries, Secondary Education, Exit Examinations, Grading
Arce, Alvaro J.; Wang, Ze – International Journal of Testing, 2012
The traditional approach to scale modified-Angoff cut scores transfers the raw cuts to an existing raw-to-scale score conversion table. Under the traditional approach, cut scores and conversion table raw scores are not only seen as interchangeable but also as originating from a common scaling process. In this article, we propose an alternative…
Descriptors: Generalizability Theory, Item Response Theory, Cutting Scores, Scaling
Davis-Becker, Susan L.; Buckendahl, Chad W.; Gerrow, Jack – International Journal of Testing, 2011
Throughout the world, cut scores are an important aspect of a high-stakes testing program because they are a key operational component of the interpretation of test scores. One method for setting standards that is prevalent in educational testing programs--the Bookmark method--is intended to be a less cognitively complex alternative to methods…
Descriptors: Standard Setting (Scoring), Cutting Scores, Educational Testing, Licensing Examinations (Professions)
Tiffin-Richards, Simon P.; Pant, Hans Anand; Koller, Olaf – Educational Measurement: Issues and Practice, 2013
Cut-scores were set by expert judges on assessments of reading and listening comprehension of English as a foreign language (EFL), using the bookmark standard-setting method to differentiate proficiency levels defined by the Common European Framework of Reference (CEFR). Assessments contained stratified item samples drawn from extensive item…
Descriptors: Foreign Countries, English (Second Language), Language Tests, Standard Setting (Scoring)
Kaliski, Pamela K.; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna L.; Plake, Barbara S.; Reshetar, Rosemary A. – Educational and Psychological Measurement, 2013
The many-faceted Rasch (MFR) model has been used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR model for examining the quality of ratings obtained from a standard…
Descriptors: Item Response Theory, Models, Standard Setting (Scoring), Science Tests

