ERIC - Search Results

Showing 1 to 15 of 24 results

Evaluating Methodological Enhancements to the Yes/No Angoff Standard-Setting Method in Language Proficiency Assessment

Peer reviewed

Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024

This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…

Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods

Examining the Impact of a Consensus Approach to Content Alignment Studies

Peer reviewed

Russell, Michael; Moncaleano, Sebastian – Practical Assessment, Research & Evaluation, 2020

Although both content alignment and standard-setting procedures rely on content-expert panel judgements, only the latter employs discussion among panel members. This study employed a modified form of the Webb methodology to examine content alignment for twelve tests administered as part of the Massachusetts Comprehensive Assessment System (MCAS).…

Descriptors: Test Content, Test Items, Discussion, Test Validity

Exploring the Influence of Judge Proficiency on Standard-Setting Judgments

Peer reviewed

Peabody, Michael R.; Wind, Stefanie A. – Journal of Educational Measurement, 2019

Setting performance standards is a judgmental process involving human opinions and values as well as technical and empirical considerations. Although all cut score decisions are by nature somewhat arbitrary, they should not be capricious. Judges selected for standard-setting panels should have the proper qualifications to make the judgments asked…

Descriptors: Standard Setting, Decision Making, Performance Based Assessment, Evaluators

An Experimental Study of the Internal Consistency of Judgments Made in Bookmark Standard Setting

Peer reviewed

Clauser, Brian E.; Baldwin, Peter; Margolis, Melissa J.; Mee, Janet; Winward, Marcia – Journal of Educational Measurement, 2017

Validating performance standards is challenging and complex. Because of the difficulties associated with collecting evidence related to external criteria, validity arguments rely heavily on evidence related to internal criteria--especially evidence that expert judgments are internally consistent. Given its importance, it is somewhat surprising…

Descriptors: Evaluation Methods, Standard Setting, Cutting Scores, Expertise

2023-2024 NSCAS Growth: English Language Arts, Mathematics, and Science Technical Report

Nebraska Department of Education, 2024

The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…

Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students

The Development of STEP, the CEFR-Based English Proficiency Test

Peer reviewed

Sridhanyarat, Kietnawin; Pathong, Supakarn; Suranakkharin, Todsapon; Ammaralikit, Amornrat – English Language Teaching, 2021

This study aimed at developing the Silpakorn Test of English Proficiency (STEP), in alignment with the Common European Framework of Reference for Languages (CEFR), and in accordance with the theoretical framework established by Alderson et al. (2006). Four major steps were involved in the test construction. First, English language lecturers who…

Descriptors: Language Tests, Language Proficiency, Second Language Learning, Second Language Instruction

Spring 2021 NSCAS Phase I Pilot ELA, Mathematics, and Science Technical Report

Nebraska Department of Education, 2021

This technical report documents the processes and procedures implemented to support the Spring 2021 Nebraska Student-Centered Assessment System (NSCAS) Phase I Pilot in English Language Arts (ELA), Mathematics, and Science assessments by NWEA® under the supervision of the Nebraska Department of Education (NDE). The technical report shows how the…

Descriptors: Psychometrics, Standard Setting, English, Language Arts

Getting Lucky: How Guessing Threatens the Validity of Performance Classifications

Peer reviewed

Foley, Brett P. – Practical Assessment, Research & Evaluation, 2016

There is always a chance that examinees will answer multiple choice (MC) items correctly by guessing. Design choices in some modern exams have created situations where guessing at random through the full exam--rather than only for a subset of items where the examinee does not know the answer--can be an effective strategy to pass the exam. This…

Descriptors: Guessing (Tests), Multiple Choice Tests, Case Studies, Test Construction

Spring 2020 NSCAS General Summative ELA, Mathematics, and Science Technical Report

Nebraska Department of Education, 2020

The Spring 2020 Nebraska Student-Centered Assessment System (NSCAS) General Summative testing was cancelled due to COVID-19. This technical report documents the processes and procedures that had been implemented to support the Spring 2020 assessments prior to the cancellation. The following sections are presented in this technical report: (1)…

Descriptors: English, Language Arts, Mathematics Tests, Science Tests

Spring 2019 NSCAS Summative ELA, Mathematics, and Science Technical Report

Nebraska Department of Education, 2019

This technical report documents the processes and procedures implemented to support the Spring 2019 Nebraska Student-Centered Assessment System (NSCAS) General Summative English Language Arts (ELA), Mathematics, and Science assessments by NWEA® under the supervision of the Nebraska Department of Education (NDE). The technical report shows how the…

Descriptors: English, Language Arts, Summative Evaluation, Mathematics Tests

Comparing Panelists' Understanding of Standard Setting across Multiple Levels of an Alternate Science Assessment

Peer reviewed

Hansen, Mary A.; Lyon, Steven R.; Heh, Peter; Zigmond, Naomi – Applied Measurement in Education, 2013

Large-scale assessment programs, including alternate assessments based on alternate achievement standards (AA-AAS), must provide evidence of technical quality and validity. This study provides information about the technical quality of one AA-AAS by evaluating the standard setting for the science component. The assessment was designed to have…

Descriptors: Alternative Assessment, Science Tests, Standard Setting, Test Validity

Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

Peer reviewed

Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi – International Journal of Evaluation and Research in Education, 2016

High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The fifty (50) economics test items developed were…

Descriptors: Item Response Theory, Test Items, Difficulty Level, Statistical Analysis

Combining the Best of Two Standard Setting Methods: The Ordered Item Booklet Angoff

Peer reviewed

Smith, Russell W.; Davis-Becker, Susan L.; O'Leary, Lisa S. – Journal of Applied Testing Technology, 2014

This article describes a hybrid standard setting method that combines characteristics of the Angoff (1971) and Bookmark (Mitzel, Lewis, Patz & Green, 2001) methods. The proposed approach utilizes strengths of each method while addressing weaknesses. An ordered item booklet, with items sorted based on item difficulty, is used in combination…

Descriptors: Standard Setting, Difficulty Level, Test Items, Rating Scales

The Bookmark Procedure for Setting Cut-Scores and Finalizing Performance Standards: Strengths and Weaknesses

Peer reviewed

Lin, Jie – Alberta Journal of Educational Research, 2006

The Bookmark standard-setting procedure was developed to address the perceived problems with the most popular method for setting cut-scores: the Angoff procedure (Angoff, 1971). The purposes of this article are to review the Bookmark procedure and evaluate it in terms of Berk's (1986) criteria for evaluating cut-score setting methods. The…

Descriptors: Standard Setting (Scoring), Cutting Scores, Evaluation Criteria, Evaluation Research

A Critique of Difficulty Estimation Methodologies in the Setting of Cut Points and a Discussion of an Alternative Methodology: The Direct Standard Setting Method.

Schoon, Craig G.; And Others – 1988

The determination of appropriate cut scores is a critical step in the development of licensing and certification examinations. Passing point methodologies based on the estimation of item difficulties are underlain by the estimation of the probability of a correct response to items by a hypothetically minimally competent candidate. The Angoff…

Descriptors: Cutting Scores, Difficulty Level, Estimation (Mathematics), Item Analysis
