Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 7 |
Descriptor
| Evaluation Methods | 17 |
| Interrater Reliability | 17 |
| Test Items | 17 |
| Scoring | 8 |
| Foreign Countries | 5 |
| Item Response Theory | 5 |
| Psychometrics | 4 |
| Scores | 4 |
| Test Reliability | 4 |
| Measures (Individuals) | 3 |
| Reading Tests | 3 |
Author
| Friedman, Greg | 2 |
| McGinty, Dixie | 2 |
| Michaels, Hillary | 2 |
| Neel, John H. | 2 |
| Ochieng, Charles | 2 |
| Yen, Shu Jing | 2 |
| Avery, Marybell | 1 |
| Boccaccini, Marcus T. | 1 |
| Chen, Ching-I | 1 |
| Clifford, Jantina R. | 1 |
| Desstya, Anatri | 1 |
Publication Type
| Reports - Research | 10 |
| Speeches/Meeting Papers | 9 |
| Journal Articles | 8 |
| Reports - Evaluative | 4 |
| Guides - General | 1 |
| Information Analyses | 1 |
| Reports - Descriptive | 1 |
Education Level
| Elementary Education | 3 |
| Grade 4 | 3 |
| Grade 6 | 2 |
| Grade 8 | 2 |
| Elementary Secondary Education | 1 |
| Higher Education | 1 |
| Intermediate Grades | 1 |
Audience
| Researchers | 1 |
Location
| Canada | 1 |
| India | 1 |
| Indonesia | 1 |
| Netherlands | 1 |
| Oregon | 1 |
| United States | 1 |
| Washington | 1 |
Assessments and Surveys
| National Assessment of… | 1 |
Koçak, Duygu – International Electronic Journal of Elementary Education, 2020
One of the most commonly used methods for measuring higher-order thinking skills such as problem-solving or written expression is open-ended items. Three main approaches are used to evaluate responses to open-ended items: general evaluation, rating scales, and rubrics. To measure and improve students' problem-solving skills, first an…
Descriptors: Interrater Reliability, Item Response Theory, Test Items, Rating Scales
Desstya, Anatri; Prasetyo, Zuhdan Kun; Suyanta; Susila, Ihwan; Irwanto – International Journal of Instruction, 2019
This study reports the development of a standardized instrument (reviewed for validity, reliability, and difficulty index) to detect science misconceptions in elementary school teachers. The study used a 4-D model: defining, designing, developing, and disseminating. First, the instrument was prepared with 47 open-ended questions, and then it…
Descriptors: Elementary School Teachers, Misconceptions, Evaluation Methods, Teacher Evaluation
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Rufino, Katrina A.; Boccaccini, Marcus T.; Guy, Laura S. – Assessment, 2011
Although reliability is essential to validity, most research on violence risk assessment tools has paid little attention to strategies for improving rater agreement. The authors evaluated the degree to which perceived subjectivity in scoring guidelines for items from two measures--the Psychopathy Checklist-Revised (PCL-R) and the Historical,…
Descriptors: Risk Management, Predictive Validity, Interrater Reliability, Scoring
Sood, Vishal – Journal on Educational Psychology, 2013
To identify children with four major kinds of verbal learning disabilities, viz. reading disability, speech and language comprehension disability, writing disability, and mathematics disability, the present study was undertaken to construct and standardize a verbal learning disabilities checklist. This checklist was developed by keeping in view the…
Descriptors: Verbal Learning, Learning Disabilities, Children, Disability Identification
Squires, Jane K.; Waddell, Misti L.; Clifford, Jantina R.; Funk, Kristin; Hoselton, Robert M.; Chen, Ching-I – Topics in Early Childhood Special Education, 2013
Psychometric and utility studies were conducted on the Social Emotional Assessment Measure (SEAM), an innovative tool for assessing and monitoring social-emotional and behavioral development in infants and toddlers with disabilities. The Infant and Toddler SEAM intervals were the study focus, using mixed methods, including item response theory…
Descriptors: Psychometrics, Evaluation Methods, Social Development, Emotional Development
Zhu, Weimo; Rink, Judy; Placek, Judith H.; Graber, Kim C.; Fox, Connie; Fisette, Jennifer L.; Dyson, Ben; Park, Youngsik; Avery, Marybell; Franck, Marian; Raynes, De – Measurement in Physical Education and Exercise Science, 2011
New testing theories, concepts, and psychometric methods (e.g., item response theory, test equating, and item bank) developed during the past several decades have many advantages over previous theories and methods. In spite of their introduction to the field, they have not been fully accepted by physical educators. Further, the manner in which…
Descriptors: Physical Education, Quality Control, Psychometrics, Item Response Theory
Micceri, Theodore; And Others – 1987
Several issues relating to agreement estimates for different types of data from performance evaluations are considered. New indices of agreement are presented for ordinal level items and for summative scores produced by nominal or ordinal level items. Two sets of empirical data illustrate the performance of the two formulas derived to estimate…
Descriptors: Correlation, Data Analysis, Educational Research, Estimation (Mathematics)
Yen, Shu Jing; Ochieng, Charles; Michaels, Hillary; Friedman, Greg – Online Submission, 2005
The main purpose of this study was to illustrate a polytomous IRT-based linking procedure that adjusts for rater variations. Test scores from two administrations of a statewide reading assessment were used. An anchor set of Year 1 students' constructed responses were rescored by Year 2 raters. To adjust for year-to-year rater variation in IRT…
Descriptors: Test Items, Measures (Individuals), Grade 8, Item Response Theory
McGinty, Dixie; Neel, John H. – 1996
A new standard setting approach is introduced, called the cognitive components approach. Like the Angoff method, the cognitive components method generates minimum pass levels (MPLs) for each item. In both approaches, the item MPLs are summed for each judge, then averaged across judges to yield the standard. In the cognitive components approach,…
Descriptors: Cognitive Processes, Criterion Referenced Tests, Evaluation Methods, Grade 3
Takala, Sauli – 1998
This paper discusses recent developments in language testing. It begins with a review of the traditional criteria that are applied to all measurement and outlines recent emphases that derive from the expanding range of stakeholders. Drawing on Alderson's seminal work, criteria are presented for evaluating communicative language tests. Developments…
Descriptors: Alternative Assessment, Communicative Competence (Languages), Comparative Analysis, Evaluation Criteria
McGinty, Dixie; Neel, John H.; Hsu, Yu-Sheng – 1996
The cognitive components standard setting method, recently introduced by D. McGinty and J. Neel (1996), asks judges to specify minimum levels of performance not for the test items, but for smaller portions of items, the component skills and concepts required to answer each item correctly. Items are decomposed into these components before judges…
Descriptors: Cognitive Processes, Criterion Referenced Tests, Elementary Education, Evaluation Methods
Yen, Shu Jing; Ochieng, Charles; Michaels, Hillary; Friedman, Greg – Online Submission, 2005
Year-to-year rater variation may result in constructed response (CR) parameter changes, making CR items inappropriate to use in anchor sets for linking or equating. This study demonstrates how rater severity affected the writing and reading scores. Rater adjustments were made to statewide results using an item response theory (IRT) methodology…
Descriptors: Test Items, Writing Tests, Reading Tests, Measures (Individuals)
Reckase, Mark D.; And Others – 1995
The research reported in this paper was conducted to gain information to guide the selection of standard-setting procedures for use with polytomous items to set achievement levels on the National Assessment of Educational Progress (NAEP) assessments in U.S. history and geography. Standard-setting procedures were evaluated to determine the relative…
Descriptors: Academic Achievement, Educational Assessment, Elementary Secondary Education, Evaluation Methods
McGhee, Debbie E.; Lowell, Nana – New Directions for Teaching and Learning, 2003
This study compares mean ratings, inter-rater reliabilities, and the factor structure of items for online and paper student-rating forms from the University of Washington's Instructional Assessment System. (Contains 3 figures and 2 tables.)
Descriptors: Psychometrics, Factor Structure, Student Evaluation of Teacher Performance, Test Items