Hogan, Thomas; DeStefano, Marissa; Gilby, Caitlin; Kosman, Dana; Peri, Joshua – Applied Measurement in Education, 2021
Buros' "Mental Measurements Yearbook (MMY)" has provided professional reviews of commercially published psychological and educational tests for over 80 years. It serves as a kind of conscience for the testing industry. For a random sample of 50 entries in the "19th MMY" (a total of 100 separate reviews) this study determined…
Descriptors: Test Reviews, Interrater Reliability, Psychological Testing, Educational Testing
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Mislevy, Robert J.; Oliveri, Maria Elena – Educational Measurement: Issues and Practice, 2019
In this digital ITEMS module, Dr. Robert [Bob] Mislevy and Dr. Maria Elena Oliveri introduce and illustrate a sociocognitive perspective on educational measurement, which focuses on a variety of design and implementation considerations for creating fair and valid assessments for learners from diverse populations with diverse sociocultural…
Descriptors: Educational Testing, Reliability, Test Validity, Test Reliability
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
Pearson product-moment correlation coefficient between item g and test score X, known as item-test or item-total correlation ("Rit"), and item-rest correlation ("Rir") are two of the most used classical estimators for item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…
Descriptors: Correlation, Test Items, Scores, Difficulty Level
W. James Popham – Pearson, 2024
"Classroom Assessment" shows pre- and in-service teachers how to use classroom testing accurately and formatively to dramatically increase their teaching effectiveness and promote student learning. In addition to clear and concise guidelines on how to develop and use quality classroom assessments, the author also focuses on the teaching…
Descriptors: Student Evaluation, Testing, Teacher Effectiveness, Test Construction
Guangming Li; Zhengyan Liang – SAGE Open, 2024
To investigate the influence of the separation of grade distributions and the ratio of common items on the precision of vertical scaling, this simulation study uses a common-item design with first grade as the base grade. Four grades of 1,000 students each take a 100-item test. A Monte Carlo simulation method is used…
Descriptors: Elementary School Students, Grade 1, Grade 2, Grade 3
Beck, Klaus – Frontline Learning Research, 2020
Many test developers try to ensure the content validity of their tests by having external experts review the items, e.g. in terms of relevance, difficulty, or clarity. Although this approach is widely accepted, a closer look reveals several pitfalls that need to be avoided if experts' advice is to be truly helpful. The purpose of this paper is to…
Descriptors: Content Validity, Psychological Testing, Educational Testing, Student Evaluation
Popham, W. James – ASCD, 2018
What is assessment literacy? It is a handful of fundamental understandings about the testing concepts and procedures that influence educational decisions. And it just might be the most cost-effective means of real school improvement. With characteristic humor and aplomb, assessment expert W. James Popham strips away the psychometrician-speak and…
Descriptors: Student Evaluation, Educational Testing, Test Validity, Test Reliability
Nebraska Department of Education, 2024
The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…
Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students
Jonson, Jessica L.; Trantham, Pamela; Usher-Tate, Betty Jean – Educational Measurement: Issues and Practice, 2019
One of the substantive changes in the 2014 Standards for Educational and Psychological Testing was the elevation of fairness in testing as a foundational element of practice in addition to validity and reliability. Previous research indicates that testing practices often do not align with professional standards and guidelines. Therefore, to raise…
Descriptors: Culture Fair Tests, Test Validity, Test Reliability, Intelligence Tests
Quaigrain, Kennedy; Arhin, Ato Kwamina – Cogent Education, 2017
Item analysis is essential in improving items which will be used again in later tests; it can also be used to eliminate misleading items in a test. The study focused on item and test quality and explored the relationship between difficulty index (p-value) and discrimination index (DI) with distractor efficiency (DE). The study was conducted among…
Descriptors: Item Analysis, Teacher Developed Materials, Test Reliability, Educational Assessment

