ERIC - Search Results

Showing 1 to 15 of 78 results

How Valid and Reliable Are Teachers' Assessments of Gifted Students?

Peer reviewed

Sümeyye Arkan; Sema Tan – International Journal of Assessment Tools in Education, 2025

Teachers' perceptions, attitudes, and opinions about students, curricula, or evaluation methods contribute to the development of students' talents. Thus, researchers often collect data from teachers to identify gifted students, determine educational practices to meet the students' needs and assess gifted education programs. Researchers often…

Descriptors: Talent Identification, Academically Gifted, Evaluation Methods, Measurement Techniques

A Note on the Use of Categorical Subscores

Peer reviewed

Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025

Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…

Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment

Using Multilabel Neural Network to Score High-Dimensional Assessments for Different Use Foci: An Example with College Major Preference Assessment

Peer reviewed

Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025

Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…

Descriptors: Tests, Testing, Scores, Test Construction

Disrupted Data: Using Longitudinal Assessment Systems to Monitor Test Score Quality

Peer reviewed

An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022

Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…

Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies

Exploring Scenario-Based Assessment of Students' Global Engineering Competency: Building Evidence of Validity of a China-Based Situational Judgment Test

Peer reviewed

Davis, Kirsten A.; Jesiek, Brent K.; Knight, David B. – Journal of Engineering Education, 2023

Background: Engineers operate in an increasingly global environment, making it important that engineering students develop global engineering competency to prepare them for success in the workplace. To understand this learning, we need assessment approaches that go beyond traditional self-report surveys. A previous study (Jesiek et al.,…

Descriptors: Vignettes, Engineering Education, Study Abroad, Foreign Countries

Assessment for Tactical Learning in Games: A Systematic Review

Peer reviewed

Barquero-Ruiz, Carmen; Arias-Estero, José Luis; Kirk, David – European Physical Education Review, 2020

The assessment of tactics is a subject of great interest in physical education and sport pedagogy. However, the lack of knowledge of the topic and the variety of assessment instruments make the assessment of tactics difficult. This study aimed to describe assessment in relation to tactical learning outcomes through an analysis of assessment…

Descriptors: Teaching Methods, Educational Games, Physical Education, Evaluation Methods

Adapting Paper-Based Tests for Computer Administration: Lessons Learned from 30 Years of Mode Effects Studies in Education

Peer reviewed

Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022

In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…

Descriptors: Computer Assisted Testing, Tests, Scores, Scoring

Simple Specifications Grading

Peer reviewed

Carlisle, Sylvia – PRIMUS, 2020

Specifications grading is a version of mastery grading distinguished by giving students clear specifications that their work must meet, and grading most things pass/fail based on those specifications. Mastery grading systems can get quite elaborate, with hierarchies of objectives and various systems for rewriting and retesting. In this article I…

Descriptors: Grading, Standards, Mathematics Instruction, Calculus

Can Student Self-Ratings Be Compared with Peer Ratings? A Study of Measurement Invariance of Multisource Feedback

Peer reviewed

Lee, Keng-Lin; Tsai, Shih-Li; Chiu, Yu-Ting; Ho, Ming-Jung – Advances in Health Sciences Education, 2016

Measurement invariance is a prerequisite for comparing measurement scores from different groups. In medical education, multi-source feedback (MSF) is utilized to assess core competencies, including the professionalism. However, little attention has been paid to the measurement invariance of assessment instruments; that is, whether an instrument…

Descriptors: Measurement, Scores, Medical Education, Competence

Multidimensional Balance in Youth with Visual Impairments

Pennell, Adam – ProQuest LLC, 2019

This dissertation consists of three studies which examined multidimensional balance in youth (≤ 21 years; Individuals with Disabilities Education Act, 2004) with visual impairments (VIs) using the Brief-Balance Evaluation Systems Test (Brief-BESTest). These studies have the potential to inform (adapted) physical education curricula and…

Descriptors: Psychomotor Skills, Youth, Visual Impairments, Human Posture

Fordham Institute's Pretend Research. Policy Brief

Phelps, Richard P. – Pioneer Institute for Public Policy Research, 2016

The Thomas B. Fordham Institute has released a report, "Evaluating the Content and Quality of Next Generation Assessments," ostensibly an evaluative comparison of four testing programs, the Common Core derived SBAC and PARCC, ACT's Aspire, and the Commonwealth of Massachusetts' MCAS. Of course, anyone familiar with Fordham's past work…

Descriptors: Evaluation Methods, Tests, Evaluation Research, Standardized Tests

Methods for Examining the Psychometric Quality of Subscores: A Review and Application

Peer reviewed

Wedman, Jonathan; Lyrén, Per-Erik – Practical Assessment, Research & Evaluation, 2015

When subscores on a test are reported to the test taker, the appropriateness of reporting them depends on whether they provide useful information above what is provided by the total score. Subscores that fail to do so lack adequate psychometric quality and should not be reported. There are several methods for examining the quality of subscores,…

Descriptors: Evaluation Methods, Psychometrics, Scores, Tests

Development of an Arts-Integrated Assessment Instrument

Peer reviewed

Spencer, Kevin Wayne; O'Rourke, Susan; Kelley, Frances – AERA Online Paper Repository, 2017

The development and initial psychometric investigation of the Hocus Focus Analytics (HFA) scale, an instrument to measure student growth and outcomes using an arts-integrated teaching approach, is reported. A 15-item measure consisting of five subscales (cognitive, motor, communication, social skills and creativity) was developed to measure…

Descriptors: Art Activities, Teaching Methods, Psychometrics, Evaluation Methods

Applying Methods to Evaluate Construct Validity in the Context of A Level Assessment

Peer reviewed

Crisp, Victoria; Shaw, Stuart – Educational Studies, 2012

Validity is a central principle of assessment relating to the appropriateness of the uses and interpretations of test results. Usually, one of the inferences that we wish to make is that the score reflects the extent of a student's learning in a given domain. Thus, it is important to establish that the assessment tasks elicit performances that…

Descriptors: Test Results, Evaluation Methods, Construct Validity, Validity

Measuring Cognitive Load with Electroencephalography and Self-Report: Focus on the Effect of English-Medium Learning for Korean Students

Peer reviewed

Lee, Hyunjeong – Educational Psychology, 2014

This study investigated a reliable and valid method for measuring cognitive load during learning through comparing various types of cognitive load measurements: electroencephalography (EEG), self-reporting, and learning outcome. A total of 43 college-level students underwent watching a documentary delivered in English or in Korean. EEG was…

Descriptors: Cognitive Ability, Correlation, Brain Hemisphere Functions, Diagnostic Tests

