Sümeyye Arkan; Sema Tan – International Journal of Assessment Tools in Education, 2025
Teachers' perceptions, attitudes, and opinions about students, curricula, or evaluation methods contribute to the development of students' talents. Thus, researchers often collect data from teachers to identify gifted students, determine educational practices to meet the students' needs and assess gifted education programs. Researchers often…
Descriptors: Talent Identification, Academically Gifted, Evaluation Methods, Measurement Techniques
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Davis, Kirsten A.; Jesiek, Brent K.; Knight, David B. – Journal of Engineering Education, 2023
Background: Engineers operate in an increasingly global environment, making it important that engineering students develop global engineering competency to prepare them for success in the workplace. To understand this learning, we need assessment approaches that go beyond traditional self-report surveys. A previous study (Jesiek et al.,…
Descriptors: Vignettes, Engineering Education, Study Abroad, Foreign Countries
Barquero-Ruiz, Carmen; Arias-Estero, José Luis; Kirk, David – European Physical Education Review, 2020
The assessment of tactics is a subject of great interest in physical education and sport pedagogy. However, the lack of knowledge of the topic and the variety of assessment instruments make the assessment of tactics difficult. This study aimed to describe assessment in relation to tactical learning outcomes through an analysis of assessment…
Descriptors: Teaching Methods, Educational Games, Physical Education, Evaluation Methods
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Carlisle, Sylvia – PRIMUS, 2020
Specifications grading is a version of mastery grading distinguished by giving students clear specifications that their work must meet, and grading most things pass/fail based on those specifications. Mastery grading systems can get quite elaborate, with hierarchies of objectives and various systems for rewriting and retesting. In this article I…
Descriptors: Grading, Standards, Mathematics Instruction, Calculus
Lee, Keng-Lin; Tsai, Shih-Li; Chiu, Yu-Ting; Ho, Ming-Jung – Advances in Health Sciences Education, 2016
Measurement invariance is a prerequisite for comparing measurement scores from different groups. In medical education, multi-source feedback (MSF) is utilized to assess core competencies, including professionalism. However, little attention has been paid to the measurement invariance of assessment instruments; that is, whether an instrument…
Descriptors: Measurement, Scores, Medical Education, Competence
Pennell, Adam – ProQuest LLC, 2019
This dissertation consists of three studies which examined multidimensional balance in youth (≤ 21 years; Individuals with Disabilities Education Act, 2004) with visual impairments (VIs) using the Brief-Balance Evaluation Systems Test (Brief-BESTest). These studies have the potential to inform (adapted) physical education curricula and…
Descriptors: Psychomotor Skills, Youth, Visual Impairments, Human Posture
Phelps, Richard P. – Pioneer Institute for Public Policy Research, 2016
The Thomas B. Fordham Institute has released a report, "Evaluating the Content and Quality of Next Generation Assessments," ostensibly an evaluative comparison of four testing programs, the Common Core derived SBAC and PARCC, ACT's Aspire, and the Commonwealth of Massachusetts' MCAS. Of course, anyone familiar with Fordham's past work…
Descriptors: Evaluation Methods, Tests, Evaluation Research, Standardized Tests
Wedman, Jonathan; Lyrén, Per-Erik – Practical Assessment, Research & Evaluation, 2015
When subscores on a test are reported to the test taker, the appropriateness of reporting them depends on whether they provide useful information above what is provided by the total score. Subscores that fail to do so lack adequate psychometric quality and should not be reported. There are several methods for examining the quality of subscores,…
Descriptors: Evaluation Methods, Psychometrics, Scores, Tests
Spencer, Kevin Wayne; O'Rourke, Susan; Kelley, Frances – AERA Online Paper Repository, 2017
The development and initial psychometric investigation of the Hocus Focus Analytics (HFA) scale, an instrument to measure student growth and outcomes using an arts-integrated teaching approach, is reported. A 15-item measure consisting of five subscales (cognitive, motor, communication, social skills and creativity) was developed to measure…
Descriptors: Art Activities, Teaching Methods, Psychometrics, Evaluation Methods
Crisp, Victoria; Shaw, Stuart – Educational Studies, 2012
Validity is a central principle of assessment relating to the appropriateness of the uses and interpretations of test results. Usually, one of the inferences that we wish to make is that the score reflects the extent of a student's learning in a given domain. Thus, it is important to establish that the assessment tasks elicit performances that…
Descriptors: Test Results, Evaluation Methods, Construct Validity, Validity
Lee, Hyunjeong – Educational Psychology, 2014
This study investigated a reliable and valid method for measuring cognitive load during learning by comparing various types of cognitive load measurements: electroencephalography (EEG), self-reporting, and learning outcome. A total of 43 college-level students watched a documentary delivered in English or in Korean. EEG was…
Descriptors: Cognitive Ability, Correlation, Brain Hemisphere Functions, Diagnostic Tests

