Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 8 |
| Since 2007 (last 20 years) | 27 |
Descriptor
| Difficulty Level | 46 |
| Test Items | 46 |
| Standard Setting (Scoring) | 33 |
| Cutting Scores | 27 |
| Item Response Theory | 14 |
| Standard Setting | 13 |
| Licensing Examinations… | 10 |
| Test Construction | 10 |
| Item Analysis | 9 |
| Interrater Reliability | 8 |
| Standards | 8 |
Author
| Wyse, Adam E. | 3 |
| Bichi, Ado Abdu | 2 |
| Chang, Lei | 2 |
| Davis-Becker, Susan L. | 2 |
| Engelhard, George, Jr. | 2 |
| Plake, Barbara S. | 2 |
| Wind, Stefanie A. | 2 |
| Wright, Benjamin D. | 2 |
| Arce, Alvaro J. | 1 |
| Ascalon, M. Evelina | 1 |
| Babcock, Ben | 1 |
Publication Type
| Reports - Research | 34 |
| Journal Articles | 27 |
| Speeches/Meeting Papers | 10 |
| Reports - Evaluative | 8 |
| Dissertations/Theses -… | 2 |
| Non-Print Media | 1 |
| Reference Materials - General | 1 |
| Reports - General | 1 |
Education Level
| Higher Education | 7 |
| Postsecondary Education | 7 |
| Secondary Education | 5 |
| Grade 5 | 3 |
| Elementary Secondary Education | 2 |
| High Schools | 2 |
| Adult Education | 1 |
| Elementary Education | 1 |
| Grade 10 | 1 |
| Grade 11 | 1 |
| Grade 3 | 1 |
Audience
| Researchers | 3 |
| Practitioners | 1 |
Laws, Policies, & Programs
| Education Consolidation… | 1 |
Assessments and Surveys
| Advanced Placement… | 2 |
Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025
This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…
Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, limited research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
Kara, Hakan; Cetin, Sevda – International Journal of Assessment Tools in Education, 2020
This study examined the efficiency of various random sampling methods for reducing the number of items rated by judges in an Angoff standard-setting study and compared the methods with each other. First, the full-length test was formed by combining the Placement Test 2012 and 2013 mathematics subsets. Then, simple random sampling…
Descriptors: Cutting Scores, Standard Setting (Scoring), Sampling, Error of Measurement
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
Bichi, Ado Abdu; Talib, Rohaya; Embong, Rahimah; Mohamed, Hasnah Binti; Ismail, Mohd Sani; Ibrahim, Abdallah – Eurasian Journal of Educational Research, 2019
Purpose: The university placement test is an important admission policy priority in Nigeria because it serves as a university-based selection criterion for placing students into undergraduate programs. Although attention has recently shifted to the call to develop standard content and standardize the test, attention has…
Descriptors: Standard Setting, Economics Education, Student Placement, Cutting Scores
Wyse, Adam E. – Applied Measurement in Education, 2018
This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…
Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)
Moloi, Qetelo M.; Kanjee, Anil; Roberts, Nicky – Pythagoras, 2019
Within initial teacher education there is increasing pressure to enhance the use of assessment data to support students to improve their knowledge and skills, and to determine what standards they meet upon graduation. For such data to be useful, both programme designers and students require meaningful and comprehensive assessment reports on…
Descriptors: Preservice Teacher Education, Teacher Education Programs, Standard Setting, Mathematics Tests
Shulruf, Boaz; Jones, Phil; Turner, Rolf – Higher Education Studies, 2015
The determination of Pass/Fail decisions over Borderline grades (i.e., grades that do not clearly distinguish between competent and incompetent examinees) has been an ongoing challenge for academic institutions. This study utilises the Objective Borderline Method (OBM) to determine examinee ability and item difficulty, and from that…
Descriptors: Undergraduate Students, Pass Fail Grading, Decision Making, Probability
Sorensen, Henry L. – ProQuest LLC, 2013
Cut-score setting processes are used to establish the passing standards for all kinds of tests in education and for credentialing. While experts use their best efforts to guide cut-score setting processes to generate valid and reliable results, cut-score participants often have a difficult time understanding the standard at which the cut score is…
Descriptors: Cutting Scores, Standard Setting (Scoring), Comparative Analysis, Difficulty Level
O'Neill, Thomas R.; Peabody, Michael R.; Stelter, Keith L.; Hagen, Michael D. – Online Submission, 2015
(Purpose) The purpose of our study was to assess the need for an external searchable resource to be used in conjunction with the American Board of Family Medicine's (ABFM) Maintenance of Certification for Family Physicians (MC-FP) Examination, discuss the philosophical question of whether an ESR should be allowed on the examination, and outline…
Descriptors: Licensing Examinations (Professions), Family Practice (Medicine), Physicians, Online Searching
Hansen, Mary A.; Lyon, Steven R.; Heh, Peter; Zigmond, Naomi – Applied Measurement in Education, 2013
Large-scale assessment programs, including alternate assessments based on alternate achievement standards (AA-AAS), must provide evidence of technical quality and validity. This study provides information about the technical quality of one AA-AAS by evaluating the standard setting for the science component. The assessment was designed to have…
Descriptors: Alternative Assessment, Science Tests, Standard Setting, Test Validity
Çetin, Sevda; Gelbal, Selahattin – Educational Sciences: Theory and Practice, 2013
In this research, the cut score of a foundation university was recalculated with the Bookmark method and with the Angoff method, both of which are standard-setting methods, and the resulting cut scores were compared with the current proficiency score. The final cut score was found to be 27.87 with the cooperative work of 17 experts through the Angoff…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Cutting Scores, Correlation
Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi – International Journal of Evaluation and Research in Education, 2016
High-stakes testing is used to provide results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items were…
Descriptors: Item Response Theory, Test Items, Difficulty Level, Statistical Analysis
Kaliski, Pamela; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna; Plake, Barbara; Reshetar, Rosemary – College Board, 2012
The Many-Facet Rasch (MFR) Model is traditionally used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR Model by examining the quality of ratings obtained from a…
Descriptors: Advanced Placement Programs, Achievement Tests, Item Response Theory, Models

