Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 5 |
| Since 2017 (last 10 years) | 12 |
| Since 2007 (last 20 years) | 18 |
Publication Type
| Reports - Research | 36 |
| Journal Articles | 20 |
| Speeches/Meeting Papers | 9 |
| Tests/Questionnaires | 6 |
| Numerical/Quantitative Data | 1 |
Education Level
| Higher Education | 3 |
| Postsecondary Education | 3 |
| Elementary Secondary Education | 2 |
| Early Childhood Education | 1 |
| Elementary Education | 1 |
| Grade 12 | 1 |
| Grade 2 | 1 |
| Grade 3 | 1 |
| Grade 4 | 1 |
| Grade 5 | 1 |
| High Schools | 1 |
Audience
| Researchers | 1 |
Location
| Tennessee | 4 |
| Kansas | 2 |
| Massachusetts | 2 |
| Nevada | 2 |
| North Carolina | 2 |
| Arizona | 1 |
| Arkansas | 1 |
| California | 1 |
| California (Los Angeles) | 1 |
| China | 1 |
| Colorado | 1 |
Laws, Policies, & Programs
| Comprehensive Education… | 1 |
| No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
| National Teacher Examinations | 6 |
| Test of English as a Foreign… | 3 |
| Massachusetts Comprehensive… | 1 |
| Pre Professional Skills Tests | 1 |
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Beheshti, Shima; Safa, Mohammad Ahmadi – Iranian Journal of Language Teaching Research, 2023
The indefinite nature of test fairness and different interpretations and definitions of the concept have stirred a lot of controversy over the years, necessitating the reconceptualization of the concept. On this basis, this study aimed to explore the empirical validity of Kunnan's (2008) Test Fairness Framework (TFF) and revisit the established…
Descriptors: Test Bias, Equal Education, Grounded Theory, Test Construction
Nicolas Rochat; Laurent Lima; Pascal Bressoux – Journal of Psychoeducational Assessment, 2025
Inference is considered an important factor in comprehension models and has been described as a causal factor in predicting comprehension. To date, specific tests for inference are rare and often rely on specific thematic texts. This reliance on thematic inference may raise some concerns as inference is related to prior text-specific knowledge.…
Descriptors: Inferences, Reading Comprehension, Reading Tests, Test Reliability
Russell, Michael; Moncaleano, Sebastian – Practical Assessment, Research & Evaluation, 2020
Although both content alignment and standard-setting procedures rely on content-expert panel judgements, only the latter employs discussion among panel members. This study employed a modified form of the Webb methodology to examine content alignment for twelve tests administered as part of the Massachusetts Comprehensive Assessment System (MCAS).…
Descriptors: Test Content, Test Items, Discussion, Test Validity
Maria Treadaway; John Read – Language Testing, 2024
Standard-setting is an essential component of test development, supporting the meaningfulness and appropriate interpretation of test scores. However, in the high-stakes testing environment of aviation, standard-setting studies are underexplored. To address this gap, we document two stages in the standard-setting procedures for the Overseas Flight…
Descriptors: Standard Setting, Diagnostic Tests, High Stakes Tests, English for Special Purposes
Sondergeld, Toni A.; Stone, Gregory E.; Kruse, Lance M. – Educational Policy, 2020
Assessment and evaluation at all levels of educational systems have become policy priorities for many countries. Two common reasons for this are student learning expectations and accountability. Although much effort has been put into the creation and refinement of content standards, standardized tests, and methods for using testing results, there…
Descriptors: Standard Setting (Scoring), Criterion Referenced Tests, Multiple Choice Tests, Student Evaluation
Peabody, Michael R.; Wind, Stefanie A. – Journal of Educational Measurement, 2019
Setting performance standards is a judgmental process involving human opinions and values as well as technical and empirical considerations. Although all cut score decisions are by nature somewhat arbitrary, they should not be capricious. Judges selected for standard-setting panels should have the proper qualifications to make the judgments asked…
Descriptors: Standard Setting, Decision Making, Performance Based Assessment, Evaluators
Clauser, Brian E.; Baldwin, Peter; Margolis, Melissa J.; Mee, Janet; Winward, Marcia – Journal of Educational Measurement, 2017
Validating performance standards is challenging and complex. Because of the difficulties associated with collecting evidence related to external criteria, validity arguments rely heavily on evidence related to internal criteria--especially evidence that expert judgments are internally consistent. Given its importance, it is somewhat surprising…
Descriptors: Evaluation Methods, Standard Setting, Cutting Scores, Expertise
Sridhanyarat, Kietnawin; Pathong, Supakarn; Suranakkharin, Todsapon; Ammaralikit, Amornrat – English Language Teaching, 2021
This study aimed at developing the Silpakorn Test of English Proficiency (STEP), in alignment with the Common European Framework of Reference for Languages (CEFR), and in accordance with the theoretical framework established by Alderson et al. (2006). Four major steps were involved in the test construction. First, English language lecturers who…
Descriptors: Language Tests, Language Proficiency, Second Language Learning, Second Language Instruction
Papageorgiou, Spiros; Wu, Sha; Hsieh, Ching-Ni; Tannenbaum, Richard J.; Cheng, Mengmeng – ETS Research Report Series, 2019
The past decade has seen an emerging interest in mapping (aligning or linking) test scores to language proficiency levels of external performance scales or frameworks, such as the Common European Framework of Reference (CEFR), as well as locally developed frameworks, such as China's Standards of English Language Ability (CSE). Such alignment is…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Computer Assisted Testing
Foley, Brett P. – Practical Assessment, Research & Evaluation, 2016
There is always a chance that examinees will answer multiple choice (MC) items correctly by guessing. Design choices in some modern exams have created situations where guessing at random through the full exam--rather than only for a subset of items where the examinee does not know the answer--can be an effective strategy to pass the exam. This…
Descriptors: Guessing (Tests), Multiple Choice Tests, Case Studies, Test Construction
Akcamete, Gonul; Kayhan, Nilay; Yildirim, A. Emel Sardohan – Cypriot Journal of Educational Sciences, 2017
Professional ethics includes the principles set forth by professional associations and accepted as correct by discussions over time, and which has become the sine qua non of a profession today. Professional ethics are established to increase the quality of professional practices and ensure correct and honest conduct. Not having professional…
Descriptors: Ethics, Special Education, Special Education Teachers, Professional Associations
Hansen, Mary A.; Lyon, Steven R.; Heh, Peter; Zigmond, Naomi – Applied Measurement in Education, 2013
Large-scale assessment programs, including alternate assessments based on alternate achievement standards (AA-AAS), must provide evidence of technical quality and validity. This study provides information about the technical quality of one AA-AAS by evaluating the standard setting for the science component. The assessment was designed to have…
Descriptors: Alternative Assessment, Science Tests, Standard Setting, Test Validity
Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi – International Journal of Evaluation and Research in Education, 2016
High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…
Descriptors: Item Response Theory, Test Items, Difficulty Level, Statistical Analysis

