Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 9 |
| Since 2007 (last 20 years) | 14 |
Descriptor
| Test Items | 24 |
| Test Validity | 24 |
| Test Construction | 16 |
| Standard Setting | 13 |
| Test Reliability | 12 |
| Standard Setting (Scoring) | 11 |
| Cutting Scores | 9 |
| Scoring | 9 |
| Psychometrics | 7 |
| Difficulty Level | 6 |
| Evaluation Methods | 6 |
Publication Type
| Reports - Research | 12 |
| Journal Articles | 10 |
| Numerical/Quantitative Data | 6 |
| Reports - Descriptive | 6 |
| Reports - Evaluative | 5 |
| Speeches/Meeting Papers | 4 |
| Guides - Classroom - Teacher | 1 |
| Tests/Questionnaires | 1 |
Education Level
| Secondary Education | 5 |
| Junior High Schools | 4 |
| Middle Schools | 4 |
| Elementary Education | 3 |
| Elementary Secondary Education | 3 |
| Grade 5 | 1 |
| Grade 6 | 1 |
| Grade 7 | 1 |
| Grade 8 | 1 |
| Higher Education | 1 |
| Intermediate Grades | 1 |
Audience
| Practitioners | 1 |
Laws, Policies, & Programs
| Comprehensive Education… | 2 |
Assessments and Surveys
| National Teacher Examinations | 2 |
| Massachusetts Comprehensive… | 1 |
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Russell, Michael; Moncaleano, Sebastian – Practical Assessment, Research & Evaluation, 2020
Although both content alignment and standard-setting procedures rely on content-expert panel judgements, only the latter employs discussion among panel members. This study employed a modified form of the Webb methodology to examine content alignment for twelve tests administered as part of the Massachusetts Comprehensive Assessment System (MCAS).…
Descriptors: Test Content, Test Items, Discussion, Test Validity
Peabody, Michael R.; Wind, Stefanie A. – Journal of Educational Measurement, 2019
Setting performance standards is a judgmental process involving human opinions and values as well as technical and empirical considerations. Although all cut score decisions are by nature somewhat arbitrary, they should not be capricious. Judges selected for standard-setting panels should have the proper qualifications to make the judgments asked…
Descriptors: Standard Setting, Decision Making, Performance Based Assessment, Evaluators
Clauser, Brian E.; Baldwin, Peter; Margolis, Melissa J.; Mee, Janet; Winward, Marcia – Journal of Educational Measurement, 2017
Validating performance standards is challenging and complex. Because of the difficulties associated with collecting evidence related to external criteria, validity arguments rely heavily on evidence related to internal criteria--especially evidence that expert judgments are internally consistent. Given its importance, it is somewhat surprising…
Descriptors: Evaluation Methods, Standard Setting, Cutting Scores, Expertise
Nebraska Department of Education, 2024
The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…
Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students
Sridhanyarat, Kietnawin; Pathong, Supakarn; Suranakkharin, Todsapon; Ammaralikit, Amornrat – English Language Teaching, 2021
This study aimed at developing the Silpakorn Test of English Proficiency (STEP), in alignment with the Common European Framework of Reference for Languages (CEFR), and in accordance with the theoretical framework established by Alderson et al. (2006). Four major steps were involved in the test construction. First, English language lecturers who…
Descriptors: Language Tests, Language Proficiency, Second Language Learning, Second Language Instruction
Nebraska Department of Education, 2021
This technical report documents the processes and procedures implemented to support the Spring 2021 Nebraska Student-Centered Assessment System (NSCAS) Phase I Pilot in English Language Arts (ELA), Mathematics, and Science assessments by NWEA® under the supervision of the Nebraska Department of Education (NDE). The technical report shows how the…
Descriptors: Psychometrics, Standard Setting, English, Language Arts
Foley, Brett P. – Practical Assessment, Research & Evaluation, 2016
There is always a chance that examinees will answer multiple choice (MC) items correctly by guessing. Design choices in some modern exams have created situations where guessing at random through the full exam--rather than only for a subset of items where the examinee does not know the answer--can be an effective strategy to pass the exam. This…
Descriptors: Guessing (Tests), Multiple Choice Tests, Case Studies, Test Construction
Nebraska Department of Education, 2020
The Spring 2020 Nebraska Student-Centered Assessment System (NSCAS) General Summative testing was cancelled due to COVID-19. This technical report documents the processes and procedures that had been implemented to support the Spring 2020 assessments prior to the cancellation. The following sections are presented in this technical report: (1)…
Descriptors: English, Language Arts, Mathematics Tests, Science Tests
Nebraska Department of Education, 2019
This technical report documents the processes and procedures implemented to support the Spring 2019 Nebraska Student-Centered Assessment System (NSCAS) General Summative English Language Arts (ELA), Mathematics, and Science assessments by NWEA® under the supervision of the Nebraska Department of Education (NDE). The technical report shows how the…
Descriptors: English, Language Arts, Summative Evaluation, Mathematics Tests
Hansen, Mary A.; Lyon, Steven R.; Heh, Peter; Zigmond, Naomi – Applied Measurement in Education, 2013
Large-scale assessment programs, including alternate assessments based on alternate achievement standards (AA-AAS), must provide evidence of technical quality and validity. This study provides information about the technical quality of one AA-AAS by evaluating the standard setting for the science component. The assessment was designed to have…
Descriptors: Alternative Assessment, Science Tests, Standard Setting, Test Validity
Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi – International Journal of Evaluation and Research in Education, 2016
High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…
Descriptors: Item Response Theory, Test Items, Difficulty Level, Statistical Analysis
Smith, Russell W.; Davis-Becker, Susan L.; O'Leary, Lisa S. – Journal of Applied Testing Technology, 2014
This article describes a hybrid standard setting method that combines characteristics of the Angoff (1971) and Bookmark (Mitzel, Lewis, Patz & Green, 2001) methods. The proposed approach utilizes strengths of each method while addressing weaknesses. An ordered item booklet, with items sorted based on item difficulty, is used in combination…
Descriptors: Standard Setting, Difficulty Level, Test Items, Rating Scales
Lin, Jie – Alberta Journal of Educational Research, 2006
The Bookmark standard-setting procedure was developed to address the perceived problems with the most popular method for setting cut-scores: the Angoff procedure (Angoff, 1971). The purposes of this article are to review the Bookmark procedure and evaluate it in terms of Berk's (1986) criteria for evaluating cut-score setting methods. The…
Descriptors: Standard Setting (Scoring), Cutting Scores, Evaluation Criteria, Evaluation Research
Schoon, Craig G.; And Others – 1988
The determination of appropriate cut scores is a critical step in the development of licensing and certification examinations. Passing point methodologies based on the estimation of item difficulties are underlain by the estimation of the probability of a correct response to items by a hypothetically minimally competent candidate. The Angoff…
Descriptors: Cutting Scores, Difficulty Level, Estimation (Mathematics), Item Analysis
