ERIC - Search Results

Showing 1 to 15 of 16 results

The Sensitivity of Value-Added Estimates to Test Scoring Decisions. EdWorkingPaper No. 25-1226

Gilbert, Joshua B.; Soland, James G.; Domingue, Benjamin W. – Annenberg Institute for School Reform at Brown University, 2025

Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…

Descriptors: Value Added Models, Tests, Testing, Scoring

Methodological Concerns about the Education Value-Added Assessment System (EVAAS): Validity, Reliability, and Bias

Peer reviewed

Amrein-Beardsley, Audrey; Geiger, Tray – SAGE Open, 2020

The Education Value-Added Assessment System (EVAAS), the value-added model (VAM) sold by the international business analytics software company SAS Institute Inc., is advertised as offering "precise, reliable and unbiased results that go far beyond what other simplistic [value-added] models found in the market today can provide." In this…

Descriptors: Value Added Models, Test Validity, Test Reliability, Test Bias

Using Test Scores to Evaluate and Hold School Teachers Accountable in New Mexico

Peer reviewed

Geiger, Tray J.; Amrein-Beardsley, Audrey; Holloway, Jessica – Educational Assessment, Evaluation and Accountability, 2020

For this study, researchers critically reviewed documents pertaining to the highest profile of the 15 teacher evaluation lawsuits that occurred throughout the U.S. as pertaining to the use of student test scores to evaluate teachers. In New Mexico, teacher plaintiffs contested how they were being evaluated and held accountable using a homegrown…

Descriptors: Court Litigation, Teacher Responsibility, Accountability, Value Added Models

The Education Value-Added Assessment System: Methodological Issues and Implications for Policy and Practicality

Peer reviewed

Geiger, Tray; Amrein-Beardsley, Audrey – AERA Online Paper Repository, 2017

The Education Value-Added Assessment System (EVAAS), the value-added model (VAM) sold by the business analytics software company SAS Institute Inc., is advertised as offering "precise, reliable and unbiased results that go far beyond what other simplistic [value-added] models found in the market today can provide." In this study, we…

Descriptors: Value Added Models, Test Validity, Test Reliability, Teacher Evaluation

Exploration of Factors Affecting the Added Value of Test Subscores

Peer reviewed

Wang, Xiaolin; Svetina, Dubravka; Dai, Shenghai – Journal of Experimental Education, 2019

Recently, interest in test subscore reporting for diagnosis purposes has been growing rapidly. The two simulation studies here examined factors (sample size, number of subscales, correlation between subscales, and three factors affecting subscore reliability: number of items per subscale, item parameter distribution, and data generating model)…

Descriptors: Value Added Models, Scores, Sample Size, Correlation

Music Teachers' Perceptions of High Stakes Teacher Evaluation

Peer reviewed

Robinson, Mitchell – Arts Education Policy Review, 2019

Recent corporate education reform policies have replaced relatively informal systems of principal observations that had been familiar to many teachers for much of their professional careers with high-stakes teacher evaluation (HSTE) systems that now determine who is allowed to remain in the profession and who gets terminated. Many education…

Descriptors: Music, Music Education, Music Teachers, Teacher Attitudes

Integrating Active Learning Labs in Precalculus: Measuring the Value Added

Peer reviewed

Bowers, Janet; Smith, Wendy; Ren, Lixin; Hanna, Robert – Investigations in Mathematics Learning, 2019

The need to incorporate active learning (AL) in higher education has become a prominent issue discussed by major leadership organizations such as the Conference Board of Mathematical Sciences (CBMS, 2016). These calls for AL are based on a large and growing body of research documenting the correlation between AL use and reduced failure rates,…

Descriptors: Active Learning, Calculus, College Mathematics, Mathematics Instruction

Statistical Properties of the GRE® Psychology Test Subscores. ETS GRE® Board Research Report. ETS GRE®-18-02. ETS Research Report. RR-18-19

Peer reviewed

Liu, Yuming; Robin, Frédéric; Yoo, Hanwook; Manna, Venessa – ETS Research Report Series, 2018

The GRE® Psychology test is an achievement test that measures core knowledge in 12 content domains that represent the courses commonly offered at the undergraduate level. Currently, a total score and 2 subscores, experimental and social, are reported to test takers as well as graduate institutions. However, the American Psychological…

Descriptors: College Entrance Examinations, Graduate Study, Psychological Testing, Scores

Measuring the Effectiveness of Two-Year Colleges: A Comparison of Raw and Value-Added Performance Indicators

Peer reviewed

Horn, Aaron S.; Horner, Olena G.; Lee, Giljae – Studies in Higher Education, 2019

Researchers in higher education frequently evaluate institutional effectiveness as the difference between an actual and predicted graduation rate, but little is known about whether such a method is reliable or valid. This study examines the measurement properties of effectiveness scores derived from regression residuals for community colleges in…

Descriptors: Instructional Effectiveness, Two Year Colleges, Comparative Analysis, Raw Scores

The Mathematics Values in Classroom Inventory: Development and Initial Validation

Peer reviewed

Tapsir, Ruzela; Nik Azis, Nik Pa – Malaysian Online Journal of Educational Sciences, 2017

Value has been identified as an essential aspect towards the quality in mathematics education at various levels of the system, institutional, curriculum, education management, and classroom interactions. However, few studies were focused on values, its development, measurement, and impact in education as compared to other affective aspects such as…

Descriptors: Focus Groups, Mathematics Education, Value Added Models, Test Construction

Do the TOEFL iBT® Section Scores Provide Value-Added Information to Stakeholders?

Peer reviewed

Sawaki, Yasuyo; Sinharay, Sandip – Language Testing, 2018

The present study examined the reliability of the reading, listening, speaking, and writing section scores for the TOEFL iBT® test and their interrelationship in order to collect empirical evidence to support, respectively, the "generalization" inference and the "explanation" inference in the TOEFL iBT validity argument…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Computer Assisted Testing

Between Scylla and Charybdis: Reflections on and Problems Associated with the Evaluation of Teachers in an Era of Metrification

Peer reviewed

Berliner, David C. – Education Policy Analysis Archives, 2018

The Scylla and Charybdis in this discussion of teacher evaluation are standardized achievement test data on the one hand, and classroom observational systems on the other. These are the two most common methods used to judge teachers' competency. Both have serious flaws: the former primarily with validity, the latter primarily with reliability. At…

Descriptors: Teacher Evaluation, Evaluation Problems, Standardized Tests, Achievement Tests

Measurement Validity and Reliability of Professional Pathways for Teachers: Technical Report. Publication 18.17

Hutchins, Shaun D. – Online Submission, 2019

The purpose of this Professional Pathways for Teachers (PPfT) evaluation was to examine the measurement validity and reliability of PPfT appraisal data from the 2017-2018 school year in the Austin Independent School District. The PPfT appraisal is a multi-measure system that covers three areas: instructional practices (IP), professional growth and…

Descriptors: Test Validity, Test Reliability, School Districts, Teacher Evaluation

Measurement Validity and Reliability of Professional Pathways for Teachers: Research Brief. Publication 18.17 RB

Hutchins, Shaun D. – Online Submission, 2019

Descriptors: Test Validity, Test Reliability, School Districts, Teacher Evaluation

The Accuracy of Aggregate Student Growth Percentiles as Indicators of Educator Performance

Peer reviewed

Castellano, Katherine E.; McCaffrey, Daniel F. – Educational Measurement: Issues and Practice, 2017

Mean or median student growth percentiles (MGPs) are a popular measure of educator performance, but they lack rigorous evaluation. This study investigates the error in MGP due to test score measurement error (ME). Using analytic derivations, we find that errors in the commonly used MGP are correlated with average prior latent achievement: Teachers…

Descriptors: Teacher Evaluation, Teacher Effectiveness, Value Added Models, Achievement Gains
