Davis, Andrew – Journal of Philosophy of Education, 2009
What is "fairness" in the context of educational assessment? I apply this question to a number of contemporary educational assessment practices and policies. My approach to philosophy of education owes much to Wittgenstein. A commentary set apart from the main body of the paper focuses on my style of philosophising. Wittgenstein teaches us to…
Descriptors: Educational Assessment, Test Validity, Equal Education, Value Judgment
Finn, Jeremy D. – Education and the Public Interest Center, 2010
In 2002, voters in Florida approved a constitutional amendment limiting class sizes in public schools to 18 students in the elementary grades, 22 students in middle grades, and 25 in high school grades. Analyzing statewide achievement data for school districts from 2004-2006 and for schools in 2007, this study purports to find that "mandated…
Descriptors: Class Size, Small Classes, Program Effectiveness, Educational Policy
Saunders, Lesley – Educational Assessment, Evaluation and Accountability, 2010
This paper is a reflection on practice. It begins by briefly describing an evaluation of an externally-funded education programme in Kosovo (a new country in south-east Europe). The programme was managed by Save the Children in Kosovo and aimed to develop and promote models of inclusive education through three strands of activity. The first of…
Descriptors: Municipalities, Educational Needs, Evaluators, Inclusive Schools
Rust, Chris – Assessment & Evaluation in Higher Education, 2007
There is already a growing literature on assessment and an emergent scholarship that people in the scholarship of teaching and learning (SoTL) community need both to promote and to build on if assessment is to be a scholarly activity. Awareness alone, however, is unlikely to be enough. The poor practice highlighted in this article is not simply…
Descriptors: Scholarship, Academic Discourse, Educational Practices, Evaluation Needs
Green, Susan K.; Johnson, Robert L.; Kim, Do-Hong; Pope, Nakia S. – Teaching and Teacher Education: An International Journal of Research and Studies, 2007
Student evaluations should "be ethical, fair, useful, feasible, and accurate" [JCSEE (2003). "The student evaluation standards." Arlen Gullickson, Chair. Thousand Oaks, CA: Corwin]. This study focuses on defining ethical behavior and examining educators' ethical judgments in relation to assessment. It describes the results from…
Descriptors: Student Evaluation, Ethics, Evaluation Methods, Teacher Behavior
Bladon, Teresa L. – Canadian Journal of Program Evaluation, 2009
Rapidly declining response rates and the associated threat of nonresponse bias call into question the validity of data obtained through telephone surveys, a tool often used in evaluation. This article explores changes in nonresponse bias over time by examining three data points (1991, 1996, and 2002) from an annual household telephone survey…
Descriptors: Telephone Surveys, Foreign Countries, Response Rates (Questionnaires), Trend Analysis
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
Dyrness, Ruth; Dyrness, Albert – Kappa Delta Pi Record, 2008
The middle school years are among the most formidable, as students transition from youth to adulthood. Self-esteem, consequently, is a significant attribute in developing the strong character necessary to successfully navigate middle school. Because a student's grades can influence how the student feels about his or her ability to learn, teachers…
Descriptors: Middle Schools, Grades (Scholastic), Academic Achievement, Grading
House, Ernest R. – American Journal of Evaluation, 2008
Drug studies are often cited as the best exemplars of evaluation design. However, many of these studies are seriously biased in favor of positive findings for the drugs evaluated, even to the point where dangerous effects are hidden. In spite of using randomized designs and double blinding, drug companies have found ways of producing the results…
Descriptors: Integrity, Evaluation Methods, Program Evaluation, Experimenter Characteristics
Walker, Gary; Kubisch, Anne C.; Bruner, Charles; Sridharan, Sanjeev; Philliber, Susan; Shaw, Greg; Dichter, Harriet – American Journal of Evaluation, 2008
The Case of Top Beginnings and the Missing Child Outcomes is a fictionalized case study of a set of evaluation challenges faced in evaluating comprehensive initiatives that are seeking to build systems. In this case, the issue is a multistate foundation-funded early learning system building initiative that has to date primarily employed a case…
Descriptors: Evaluators, Formative Evaluation, Systems Building, Case Method (Teaching Technique)
Elliott, Stephen N.; Gresham, Frank M.; Frank, Jennifer L.; Beddow, Peter A., III – Assessment for Effective Intervention, 2008
The term "intervention validity" refers to the extent to which assessment results can be used to guide the selection of interventions and evaluation of outcomes. In this article, the authors review the defining attributes of rating scales that distinguish them from other assessment tools, assumptions regarding the use of rating scales to…
Descriptors: Intervention, Social Behavior, Validity, Behavior Rating Scales
Wong, Pia Lindquist; Glass, Ronald David – Yearbook of the National Society for the Study of Education, 2011
A central commitment for professional development schools (PDSs) is to link preservice teacher preparation and in-service teacher professional development with improved learning outcomes for pupils. PDSs are expected to improve student achievement in two primary ways: (1) by enriching and intensifying the learning environment through professional…
Descriptors: Student Teachers, Professional Development Schools, Mentors, Academic Achievement
Tennessee Department of Education, 2012
In the summer of 2011, the Tennessee Department of Education contracted with the National Institute for Excellence in Teaching (NIET) to provide a four-day training for all evaluators across the state. NIET trained more than 5,000 evaluators intensively in the state model (districts using alternative instruments delivered their own training).…
Descriptors: Video Technology, Feedback (Response), Evaluators, Interrater Reliability
Marjanovic, Sonja; Hanney, Stephen; Wooding, Steven – RAND Corporation, 2009
This report critically examines studies of how scientific research drives innovation which is then translated into socio-economic benefits. It focuses on research evaluation insights that are relevant not only to the academic community, but also to policymakers and evaluation practitioners--and particularly to biomedical and health research…
Descriptors: Scientific Research, Evaluation Research, Research and Development, Research Reports
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology

