Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 4 |
| Since 2017 (last 10 years) | 9 |
| Since 2007 (last 20 years) | 41 |
Descriptor
| Item Analysis | 70 |
| Test Validity | 36 |
| Test Construction | 29 |
| Test Items | 26 |
| Test Reliability | 21 |
| Construct Validity | 17 |
| Validity | 16 |
| Psychometrics | 15 |
| Evaluation Methods | 11 |
| Student Evaluation | 10 |
| Reliability | 9 |
Author
| Ferrando, Pere J. | 2 |
| Ketterlin-Geller, Leanne R. | 2 |
| Liu, Kimy | 2 |
| Ahmed, S. | 1 |
| Allee-Smith, Paula J. | 1 |
| Aman, Michael G. | 1 |
| Anderson, Robin D. | 1 |
| Anderson, Trevor R. | 1 |
| Avery, Marybell | 1 |
| Baer, Ruth A. | 1 |
| Bardar, Erin M. | 1 |
Publication Type
| Reports - Descriptive | 70 |
| Journal Articles | 49 |
| Numerical/Quantitative Data | 5 |
| Opinion Papers | 4 |
| Speeches/Meeting Papers | 4 |
| Reports - Research | 3 |
| Tests/Questionnaires | 3 |
| Guides - Non-Classroom | 1 |
Education Level
| Higher Education | 14 |
| Elementary Education | 7 |
| Elementary Secondary Education | 5 |
| Grade 4 | 5 |
| Grade 5 | 5 |
| Middle Schools | 5 |
| Grade 6 | 4 |
| Grade 7 | 4 |
| High Schools | 4 |
| Junior High Schools | 4 |
| Early Childhood Education | 3 |
Audience
| Teachers | 5 |
| Practitioners | 3 |
| Researchers | 3 |
| Administrators | 2 |
Location
| Australia | 3 |
| Finland | 2 |
| Massachusetts | 2 |
| Netherlands | 2 |
| Tennessee | 2 |
| Belgium | 1 |
| Czech Republic | 1 |
| France | 1 |
| Georgia | 1 |
| Germany | 1 |
| Hong Kong | 1 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
| Eysenck Personality Inventory | 1 |
| General Educational… | 1 |
| Parenting Stress Index | 1 |
| Pre Professional Skills Tests | 1 |
| SAT (College Admission Test) | 1 |
| Stanford Achievement Tests | 1 |
Stephen Humphry; Paul Montuoro; Carolyn Maxwell – Journal of Psychoeducational Assessment, 2024
This article builds upon a prominent definition of construct validity that focuses on variation in attributes causing variation in measurement outcomes. This article synthesizes the definition and uses Rasch measurement modeling to explicate a modified conceptualization of construct validity for assessments of developmental attributes. If…
Descriptors: Construct Validity, Measurement Techniques, Developmental Stages, Item Analysis
Meike Akveld; George Kinnear – International Journal of Mathematical Education in Science and Technology, 2024
Many universities use diagnostic tests to assess incoming students' preparedness for mathematics courses. Diagnostic test results can help students to identify topics where they need more practice and give lecturers a summary of strengths and weaknesses in their class. We demonstrate a process that can be used to make improvements to a mathematics…
Descriptors: Mathematics Tests, Diagnostic Tests, Test Items, Item Analysis
Becerra, Beatriz; Núñez, Paola; Vergara, Claudia; Santibáñez, David; Krüger, Dirk; Cofré, Hernán – Research in Science Education, 2023
Despite the importance of evolution for understanding biology, there is significant evidence that many biology teachers have difficulty teaching this topic successfully. The purpose of this study is to describe procedures by which a paper-and-pencil instrument to assess teachers' pedagogical content knowledge for evolution (PCK[subscript evo]) was…
Descriptors: Evolution, Science Instruction, Pedagogical Content Knowledge, Construct Validity
Feranchak, Bret; Deiger, Megan – AERA Online Paper Repository, 2017
Increasingly, content area projects and programs at the K-12 level, such as in mathematics, involve a programmatic component or project emphasis on developing "teacher leadership." However, there is no consistent definition or framework for this construct and even fewer validated tools for measuring it. This paper describes our efforts in…
Descriptors: Teacher Leadership, Mathematics Instruction, Guidelines, Elementary Secondary Education
Nebraska Department of Education, 2024
The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…
Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students
Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U. – Educational Measurement: Issues and Practice, 2016
The main points of Sijtsma and Green and Yang in Educational Measurement: Issues and Practice (34, 4) are that reliability, internal consistency, and unidimensionality are distinct and that Cronbach's alpha may be problematic. Neither of these assertions is at odds with Davenport, Davison, Liou, and Love in the same issue. However, many authors…
Descriptors: Educational Assessment, Reliability, Validity, Test Construction
Spurgeon, Shawn L. – Measurement and Evaluation in Counseling and Development, 2017
Construct irrelevance (CI) and construct underrepresentation (CU) are 2 major threats to validity, yet they are rarely discussed within the counseling literature. This article provides information about the relevance of these threats to internal validity. An illustrative case example will be provided to assist counselors in understanding these…
Descriptors: Construct Validity, Evaluation Criteria, Evaluation Methods, Evaluation Problems
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in an educational system performs a number of functions; the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize tests in educational policies and quality assurance, their validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
Ketterlin-Geller, Leanne R.; Yovanoff, Paul; Jung, EunJu; Liu, Kimy; Geller, Josh – Educational Assessment, 2013
In this article, we highlight the need for a precisely defined construct in score-based validation and discuss the contribution of cognitive theories to accurately and comprehensively defining the construct. We propose a framework for integrating cognitively based theoretical and empirical evidence to specify and evaluate the construct. We apply…
Descriptors: Test Validity, Construct Validity, Scores, Evidence
Johnson, Alyce O. – Journal of Psychoeducational Assessment, 2015
The "Parenting Stress Index, Fourth Edition" (PSI-4) is a 120-item measure used to explore parental stress levels considering a parent's relationship with one of his or her children between the ages of 1 month and 12 years. The main purpose of the test is to define these stress levels and from where they originate in order to identify…
Descriptors: Anxiety, Measures (Individuals), Parents, Child Rearing
Rogler, Dawn – English Teaching Forum, 2014
This article presents principles and practices of effective assessment, outlining seven key concepts--usefulness, reliability, validity, practicality, washback, authenticity, and transparency--and demonstrating how to apply them in creating an exam blueprint. The article also discusses the importance of providing feedback after a test has been…
Descriptors: Testing, Student Evaluation, Validity, Reliability
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Dyson, Ben; Placek, Judith H.; Graber, Kim C.; Fisette, Jennifer L.; Rink, Judy; Zhu, Weimo; Avery, Marybell; Franck, Marian; Fox, Connie; Raynes, De; Park, Youngsik – Measurement in Physical Education and Exercise Science, 2011
This article describes how assessments in PE Metrics were developed following six steps: (a) determining test blueprint, (b) writing assessment tasks and scoring rubrics, (c) establishing content validity, (d) piloting assessments, (e) conducting item analysis, and (f) modifying the assessments based on analysis and expert opinion. A task force,…
Descriptors: Expertise, Evidence, Physical Education, Elementary Education
Gormally, Cara; Brickman, Peggy; Lutz, Mary – CBE - Life Sciences Education, 2012
Life sciences faculty agree that developing scientific literacy is an integral part of undergraduate education and report that they teach these skills. However, few measures of scientific literacy are available to assess students' proficiency in using scientific literacy skills to solve scenarios in and beyond the undergraduate biology classroom.…
Descriptors: Testing, Biology, Undergraduate Study, Educational Change

