Daniel McNeish; Melissa G. Wolf – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Despite the popularity of traditional fit index cutoffs like RMSEA ≤ 0.06 and CFI ≥ 0.95, several studies have noted issues with overgeneralizing traditional cutoffs. Computational methods have been proposed to avoid overgeneralization by deriving cutoffs specifically tailored to the characteristics…
Descriptors: Structural Equation Models, Cutting Scores, Generalizability Theory, Error of Measurement
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, limited research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
Kara, Hakan; Cetin, Sevda – International Journal of Assessment Tools in Education, 2020
In this study, the efficiency of various random sampling methods for reducing the number of items rated by judges in an Angoff standard-setting study was examined, and the methods were compared with one another. First, the full-length test was formed by combining the Placement Test 2012 and 2013 mathematics subsets. Then, simple random sampling…
Descriptors: Cutting Scores, Standard Setting (Scoring), Sampling, Error of Measurement
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy of a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
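The small-sample linear equating that Bramley's entry compares against can be sketched as a single mean-sigma link (chained linear equating adds a second link through an anchor test, but each link has this form). The function name and sample data below are illustrative assumptions, not the study's own code:

```python
from statistics import mean, stdev

def linear_equate(cut_x, scores_x, scores_y):
    """Map a cut score from test X onto test Y via linear (mean-sigma)
    equating: match the means and standard deviations of the two forms.
    A minimal one-link sketch of the family of methods the study simulates.
    """
    slope = stdev(scores_y) / stdev(scores_x)
    return mean(scores_y) + slope * (cut_x - mean(scores_x))

# Hypothetical small samples of total scores on the two forms
scores_x = [10, 12, 14, 16, 18]
scores_y = [12, 14, 16, 18, 20]
print(linear_equate(14.0, scores_x, scores_y))  # -> 16.0 (midpoint maps to midpoint)
```

With such small samples the slope and intercept are themselves noisy, which is exactly the equating error the simulation weighs against judgemental standard-setting error.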
Wudthayagorn, Jirada – LEARN Journal: Language Education and Acquisition Research Network, 2018
The purpose of this study was to map the Chulalongkorn University Test of English Proficiency, or the CU-TEP, to the Common European Framework of Reference (CEFR) by employing a standard setting methodology. Thirteen experts judged 120 items of the CU-TEP using the Yes/No Angoff technique. The experts decided whether or not a borderline student at…
Descriptors: Guidelines, Rating Scales, English (Second Language), Language Tests
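The Yes/No Angoff technique used in the CU-TEP study reduces each item judgment to a binary decision, so a panel cut score can be computed very simply. This is a minimal sketch with invented panel data, omitting the rounds, feedback, and CEFR-level structure of the actual study:

```python
def yes_no_angoff_cut(judgments):
    """Panel cut score under the Yes/No Angoff technique.

    judgments: one list per judge; each entry is True if the judge decided
    a borderline examinee would answer that item correctly.  A judge's cut
    is the count of "yes" items; the panel cut is the mean across judges.
    """
    per_judge = [sum(j) for j in judgments]
    return sum(per_judge) / len(per_judge)

# Three hypothetical judges rating a five-item test
panel = [
    [True, True, False, True, False],  # judge 1 cut: 3
    [True, True, True, False, False],  # judge 2 cut: 3
    [True, True, True, True, False],   # judge 3 cut: 4
]
print(yes_no_angoff_cut(panel))  # -> 3.33...
```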
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
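The Kannan et al. entry turns on the mechanics of the (modified) Angoff method: each judge assigns every item a probability that a minimally qualified candidate answers it correctly, and the cut score is the sum of the mean ratings. A subset-based recommendation rescales the subset cut to full test length. The data and helper below are illustrative assumptions, not the study's materials:

```python
from statistics import mean

# Rows: judges; columns: items.  Each cell is a judged probability that a
# minimally qualified candidate answers the item correctly (hypothetical).
ratings = [
    [0.6, 0.8, 0.4, 0.7, 0.5, 0.9],
    [0.5, 0.7, 0.5, 0.6, 0.6, 0.8],
    [0.7, 0.9, 0.3, 0.8, 0.4, 0.9],
]

def angoff_cut(ratings, items=None):
    """Cut score = sum over the chosen items of the mean rating across judges."""
    cols = items if items is not None else range(len(ratings[0]))
    return sum(mean(row[i] for row in ratings) for i in cols)

full_cut = angoff_cut(ratings)
# Cut recommended from a subset of items, rescaled to full test length
subset = [0, 2, 4]
subset_cut = angoff_cut(ratings, subset) * len(ratings[0]) / len(subset)
print(full_cut, subset_cut)  # -> ~3.87 vs 3.0 for this toy panel
```

The G-theory question in the study is whether `subset_cut` generalizes to `full_cut` across judge and item samples, not just whether it matches in one panel.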
"Maybe I'm Not as Good as I Think I Am." How Qualification Users Interpret Their Examination Results
Chamberlain, Suzanne – Educational Research, 2012
Background: Assessment grades are "estimates" of ability or performance and there are many reasons why an awarded grade might not meet a candidate's expectations, being either better or poorer than anticipated. Although there may be some obvious reasons for grade discrepancies, such as a lack of preparation or under-performance, there…
Descriptors: Foreign Countries, Outcome Measures, Evaluation Criteria, Scores
Hsieh, Mingchuan – Language Assessment Quarterly, 2013
The Yes/No Angoff and Bookmark methods are currently two of the most popular methods for setting standards on educational assessments. However, there is no research into the comparability of these two methods in the context of language assessment. This study compared results from the Yes/No Angoff and Bookmark methods as applied to…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Language Tests, Multiple Choice Tests
Lee, Guemin; Lewis, Daniel M. – Educational and Psychological Measurement, 2008
The bookmark standard-setting procedure is an item response theory-based method that is widely implemented in state testing programs. This study estimates standard errors for cut scores resulting from bookmark standard settings under a generalizability theory model and investigates the effects of different universes of generalization and error…
Descriptors: Generalizability Theory, Testing Programs, Error of Measurement, Cutting Scores
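The simplest generalizability-theory view behind standard errors like those in Lee and Lewis treats panelists as a random sample from a universe of panelists, so the error variance of the panel cut is the judge variance component divided by the number of judges. The sketch below shows only that one-facet case with invented bookmark placements; the paper's designs also include rounds and interaction terms:

```python
from math import sqrt
from statistics import mean, variance

def cut_and_se(panelist_cuts):
    """Panel cut score and its standard error, treating panelists as a
    random sample (one-facet G-theory sketch: SE^2 = judge variance / n)."""
    n = len(panelist_cuts)
    return mean(panelist_cuts), sqrt(variance(panelist_cuts) / n)

# Hypothetical bookmark placements already mapped to the score scale
cuts = [23, 25, 24, 27, 26]
c, se = cut_and_se(cuts)
print(c, se)  # -> 25  0.707...
```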
Bowden, Stephen C.; Weiss, Lawrence G.; Holdnack, James A.; Bardenhagen, Fiona J.; Cook, Mark J. – Assessment, 2008
A psychological measurement model provides an explicit definition of (a) the theoretical and (b) the numerical relationships between observed scores and the latent variables that underlie the observed scores. Examination of the metric invariance of a measurement model involves testing the hypothesis that all components of the model relating…
Descriptors: Measurement Techniques, Foreign Countries, Cognitive Ability, Scores
Kane, Michael; Wilson, Jennifer – 1982
This paper evaluates the magnitude of the total error in estimates of the difference between an examinee's domain score and the cutoff score. An observed score based on a random sample of items from the domain, and an estimated cutoff score derived from a judgmental standard setting procedure are assumed. The work of Brennan and Lockwood (1980) is…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Mastery Tests
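Kane and Wilson's total error combines two sources: item-sampling error in the examinee's observed score and judge-sampling error in the estimated cutoff. If the two sources are independent, the error variances add, which a one-line sketch makes concrete (the figures here are invented for illustration):

```python
from math import sqrt

def se_difference(se_domain, se_cut):
    """SE of (estimated domain score - estimated cut score), assuming the
    item-sampling and judge-sampling error sources are independent, so
    their error variances add."""
    return sqrt(se_domain**2 + se_cut**2)

print(se_difference(0.03, 0.04))  # -> 0.05 on the proportion-correct scale
```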
McLean, James E.; Lockwood, Robert E. – 1983
The sources of variability in the Angoff standard-setting procedure, when applied to the Alabama High School Graduation Examination (AHSGE), were examined. The sources of variability examined are judges, rounds (replications), competencies (items), and interactions among these three sources. After training, the judges were given a statement of a…
Descriptors: Academic Standards, Cutting Scores, Error of Measurement, Factor Analysis
deGruijter, Dato N. M. – 1980
The setting of standards involves subjective value judgments. The inherent arbitrariness of specific standards has been severely criticized by Glass. His opponents agree that standard setting is a judgmental task, but they have pointed out that arbitrariness in the positive sense of serious judgmental decisions is unavoidable. Further, small…
Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests
Meskauskas, John A. – Evaluation and the Health Professions, 1986
Two new indices of stability of content-referenced standard-setting results are presented, relating variability of judges' decisions to the variability of candidate scores and to the reliability of the test. These indices are used to indicate whether scores resulting from a standard-setting study are of sufficient precision. (Author/LMO)
Descriptors: Certification, Credentials, Error of Measurement, Generalizability Theory
Ziomek, Robert L.; Szymczuk, Mike – 1983
To evaluate standard-setting procedures, this study investigated the classification errors associated with the contrasting-groups procedure, rather than taking the more common approach of simply comparing the derived standards or failure rates across various techniques. Monte Carlo simulations were employed to produce…
Descriptors: Classification, Computer Simulation, Error of Measurement, Evaluation Methods
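The Monte Carlo logic in the Ziomek and Szymczuk entry can be sketched in a few lines: draw true scores, add measurement error, and count examinees whom the cut score classifies differently on their true and observed scores. The distributions, cut, and error level below are all illustrative assumptions, not the study's design:

```python
import random

def classification_error_rates(n=100_000, cut=0.6, se=0.05, seed=7):
    """Monte Carlo sketch of misclassification at a cut score: false
    positives (true score below the cut, observed score at or above it)
    and false negatives (the reverse), as proportions of all examinees."""
    rng = random.Random(seed)
    fp = fn = 0
    for _ in range(n):
        true = rng.uniform(0.3, 0.9)           # assumed true-score distribution
        observed = true + rng.gauss(0.0, se)   # assumed normal measurement error
        if true < cut and observed >= cut:
            fp += 1
        elif true >= cut and observed < cut:
            fn += 1
    return fp / n, fn / n

fp_rate, fn_rate = classification_error_rates()
print(fp_rate, fn_rate)  # a few percent each under these assumptions
```

Varying the cut, the error level, or the true-score distribution and re-running is exactly the kind of comparison such simulation studies report.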
