Publication Date

| Range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 4 |
| Since 2007 (last 20 years) | 9 |
Descriptor

| Descriptor | Count |
| --- | --- |
| Error of Measurement | 11 |
| Standard Setting | 11 |
| Cutting Scores | 8 |
| Test Items | 6 |
| Item Response Theory | 5 |
| Computation | 3 |
| Generalizability Theory | 3 |
| Goodness of Fit | 3 |
| Psychometrics | 3 |
| Test Reliability | 3 |
| Academic Achievement | 2 |
Author

| Author | Count |
| --- | --- |
| Chamberlain, Suzanne | 1 |
| Clauser, Brian E. | 1 |
| Clauser, Jerome C. | 1 |
| Griph, Gerald W. | 1 |
| Kane, Michael | 1 |
| Kolstad, Andrew | 1 |
| Lee, Guemin | 1 |
| Lewis, Daniel M. | 1 |
| McNeish, Daniel | 1 |
| Munyofu, Paul | 1 |
| Wolf, Melissa G. | 1 |
Publication Type

| Publication Type | Count |
| --- | --- |
| Journal Articles | 9 |
| Reports - Research | 4 |
| Reports - Descriptive | 3 |
| Reports - Evaluative | 3 |
| Numerical/Quantitative Data | 2 |
| Opinion Papers | 1 |
Education Level

| Education Level | Count |
| --- | --- |
| Elementary Secondary Education | 2 |
| Higher Education | 1 |
| Postsecondary Education | 1 |
| Secondary Education | 1 |
Assessments and Surveys

| Assessment | Count |
| --- | --- |
| California Learning… | 1 |
| National Assessment of… | 1 |
| Test of English as a Foreign… | 1 |
| Test of English for… | 1 |
McNeish, Daniel; Wolf, Melissa G. – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Despite the popularity of traditional fit index cutoffs like RMSEA ≤ 0.06 and CFI ≥ 0.95, several studies have noted issues with overgeneralizing traditional cutoffs. Computational methods have been proposed to avoid overgeneralization by deriving cutoffs specifically tailored to the characteristics…
Descriptors: Structural Equation Models, Cutting Scores, Generalizability Theory, Error of Measurement
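The tailored-cutoff idea described in this abstract can be sketched as a simulation exercise. The sketch below is illustrative only, not the authors' implementation: the RMSEA values that would come from replications of a correctly specified model matched to the empirical conditions are faked with a normal distribution, and all numbers are made up.

```python
import random

random.seed(1)

# Stand-in for RMSEA values from 500 simulated replications of a
# correctly specified model (hypothetical values, not study data).
simulated_rmsea = sorted(abs(random.gauss(0.03, 0.01)) for _ in range(500))

# A tailored cutoff is an upper quantile of this simulated distribution:
# observed RMSEA above it would be unusual if the model were correctly
# specified for these particular conditions.
tailored_cutoff = simulated_rmsea[int(0.95 * len(simulated_rmsea))]
```

The point of the sketch is that the cutoff is derived from the analyst's own model and sample size rather than from a one-size-fits-all rule of thumb.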
Clauser, Brian E.; Kane, Michael; Clauser, Jerome C. – Journal of Educational Measurement, 2020
An Angoff standard-setting study generally yields judgments on a number of items by a number of judges (who may or may not be nested in panels). Variability associated with judges (and possibly panels) contributes error to the resulting cut score. The variability associated with items plays a more complicated role. To the extent that the mean item…
Descriptors: Cutting Scores, Generalization, Decision Making, Standard Setting
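As a toy illustration of how judge variability propagates into the cut score, the following sketch computes a panel cut score and a judge-based standard error from a small table of hypothetical Angoff ratings (all values invented; this is not the study's procedure in full):

```python
import statistics

# Hypothetical Angoff ratings: rows are judges, columns are items.
# Each value is the judged probability that a minimally competent
# examinee answers that item correctly.
ratings = [
    [0.6, 0.4, 0.7, 0.5],  # judge 1
    [0.5, 0.5, 0.6, 0.4],  # judge 2
    [0.7, 0.3, 0.8, 0.6],  # judge 3
]

# Each judge's recommended cut score is the sum of their item
# judgments; the panel cut score averages over judges.
judge_cuts = [sum(row) for row in ratings]
cut_score = statistics.mean(judge_cuts)

# A simple standard error treats judges as a random sample, so
# judge-to-judge variability is the error source.
se_judges = statistics.stdev(judge_cuts) / len(judge_cuts) ** 0.5
```

This captures only the judge facet; the abstract's point is that items play a subtler role, which a full generalizability analysis would address.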
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
Wudthayagorn, Jirada – LEARN Journal: Language Education and Acquisition Research Network, 2018
The purpose of this study was to map the Chulalongkorn University Test of English Proficiency, or the CU-TEP, to the Common European Framework of Reference (CEFR) by employing a standard setting methodology. Thirteen experts judged 120 items of the CU-TEP using the Yes/No Angoff technique. The experts decided whether or not a borderline student at…
Descriptors: Guidelines, Rating Scales, English (Second Language), Language Tests
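The Yes/No Angoff technique described above reduces to counting "yes" judgments per judge and averaging. A minimal sketch with made-up judgments (not CU-TEP data):

```python
# Hypothetical Yes/No Angoff: each judge marks, per item, whether a
# borderline examinee would answer correctly (True) or not (False).
yes_no = [
    [True, False, True, True, False],   # judge 1
    [True, True, True, False, False],   # judge 2
    [True, False, True, True, True],    # judge 3
]

# A judge's cut score is their count of "yes" items; the panel
# cut score averages those counts across judges.
judge_cuts = [sum(row) for row in yes_no]          # [3, 3, 4]
cut_score = sum(judge_cuts) / len(judge_cuts)      # 10/3 of 5 items
```

With 120 items and thirteen judges, as in the study, the same arithmetic applies at scale.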
"Maybe I'm Not as Good as I Think I Am." How Qualification Users Interpret Their Examination Results
Chamberlain, Suzanne – Educational Research, 2012
Background: Assessment grades are "estimates" of ability or performance and there are many reasons why an awarded grade might not meet a candidate's expectations, being either better or poorer than anticipated. Although there may be some obvious reasons for grade discrepancies, such as a lack of preparation or under-performance, there…
Descriptors: Foreign Countries, Outcome Measures, Evaluation Criteria, Scores
Munyofu, Paul – Performance Improvement Quarterly, 2010
The state of Pennsylvania, like many organizations interested in performance improvement, routinely engages in professional development activities. Educators in this hands-on activity engaged in setting meaningful criterion-referenced cut scores for career and technical education assessments using two methods. The main purposes of this study were…
Descriptors: Standard Setting, Cutting Scores, Professional Development, Vocational Education
Yin, Ping; Sconing, James – Educational and Psychological Measurement, 2008
Standard-setting methods are widely used to determine cut scores on a test that examinees must meet for a certain performance standard. Because standard setting is a measurement procedure, it is important to evaluate variability of cut scores resulting from the standard-setting process. Generalizability theory is used in this study to estimate…
Descriptors: Generalizability Theory, Standard Setting, Cutting Scores, Test Items
Lee, Guemin; Lewis, Daniel M. – Educational and Psychological Measurement, 2008
The bookmark standard-setting procedure is an item response theory-based method that is widely implemented in state testing programs. This study estimates standard errors for cut scores resulting from bookmark standard settings under a generalizability theory model and investigates the effects of different universes of generalization and error…
Descriptors: Generalizability Theory, Testing Programs, Error of Measurement, Cutting Scores
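A generalizability-theory treatment of cut-score error, as in the two entries above, can be sketched for the simplest judges-crossed-with-items design with one rating per cell. The ratings below are hypothetical, and the two standard errors correspond to different universes of generalization (items treated as random vs. fixed), echoing the abstract's comparison:

```python
import math

# Hypothetical judge-by-item ratings (judges crossed with items).
ratings = [
    [0.55, 0.40, 0.70],
    [0.60, 0.45, 0.65],
    [0.50, 0.35, 0.80],
    [0.65, 0.40, 0.75],
]
nj, ni = len(ratings), len(ratings[0])

grand = sum(sum(row) for row in ratings) / (nj * ni)
judge_means = [sum(row) / ni for row in ratings]
item_means = [sum(ratings[j][i] for j in range(nj)) / nj for i in range(ni)]

# Mean squares for a two-way crossed design, one observation per cell.
ms_j = ni * sum((m - grand) ** 2 for m in judge_means) / (nj - 1)
ms_i = nj * sum((m - grand) ** 2 for m in item_means) / (ni - 1)
ms_res = sum(
    (ratings[j][i] - judge_means[j] - item_means[i] + grand) ** 2
    for j in range(nj) for i in range(ni)
) / ((nj - 1) * (ni - 1))

# Estimated variance components (negative estimates truncated at zero).
var_res = ms_res
var_j = max((ms_j - ms_res) / ni, 0.0)
var_i = max((ms_i - ms_res) / nj, 0.0)

# Standard error of the cut score (the grand mean) under two universes:
# items random (generalizing over item sampling) vs. items fixed.
se_items_random = math.sqrt(var_j / nj + var_i / ni + var_res / (nj * ni))
se_items_fixed = math.sqrt(var_j / nj + var_res / (nj * ni))
```

Choosing the universe of generalization changes which variance components count as error, which is why the fixed-items standard error can never exceed the random-items one.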
Reckase, Mark D. – Educational Measurement: Issues and Practice, 2006
Schulz (2006) provides a different perspective on standard setting than that provided in Reckase (2006). He also suggests a modification to the bookmark procedure and some alternative models for errors in panelists' judgments than those provided by Reckase. This article provides a response to some of the points made by Schulz and reports some…
Descriptors: Evaluation Methods, Standard Setting, Reader Response, Regression (Statistics)
New Mexico Public Education Department, 2007
The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of and technical characteristics of the 2007 NMSBA. The 2007 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Summary of student performance; (4) Statistical analyses of item and…
Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring
Griph, Gerald W. – New Mexico Public Education Department, 2006
The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of and technical characteristics of the 2006 NMSBA. The 2006 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Calibration, scaling, and equating procedures; (4) Standard setting;…
Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring