Showing 1 to 15 of 44 results
Peer reviewed
Andrew P. Jaciw – American Journal of Evaluation, 2025
By design, randomized experiments (XPs) rule out bias from confounded selection of participants into conditions. Quasi-experiments (QEs) are often considered second-best because they do not share this benefit. However, when results from XPs are used to generalize causal impacts, the benefit from unconfounded selection into conditions may be offset…
Descriptors: Elementary School Students, Elementary School Teachers, Generalization, Test Bias
Peer reviewed
Williams, Jazz C. – English in Education, 2015
Several inference types serving distinct purposes are established in the literature on reading comprehension. Despite this highlighting that inference is a non-unitary construct, reading tests tend to treat it as a single ability. Consequently, different tests can assess different inferential abilities. Professionals, knowing what is implicitly…
Descriptors: Inferences, Sentences, Reading Comprehension, Reading Tests
Keiffer, Elizabeth Ann – ProQuest LLC, 2011
A differential item functioning (DIF) simulation study was conducted to explore the type and level of impact that contamination had on type I error and power rates in DIF analyses when the suspect item favored the same or opposite group as the DIF items in the matching subtest. Type I error and power rates were displayed separately for the…
Descriptors: Test Items, Sample Size, Simulation, Identification
Herman, Joan L.; Heritage, Margaret; Goldschmidt, Pete – Assessment and Accountability Comprehensive Center, 2011
States and districts across the country are grappling with how to incorporate assessments of student learning into their teacher evaluation systems. Sophisticated statistical models have been proposed to estimate the relative value individual teachers add to their students' assessment performance (hence the term teacher "value-added" measures).…
Descriptors: Teacher Evaluation, Testing, Test Selection, Test Construction
Peer reviewed
Thomas, Michael L. – Assessment, 2011
Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. Although IRT has become prevalent in the measurement of ability and achievement, its contributions to clinical domains have been less extensive. Applications of IRT to clinical…
Descriptors: Item Response Theory, Psychological Evaluation, Reliability, Error of Measurement
Herman, Joan L.; Osmundson, Ellen; Dietel, Ronald – Assessment and Accountability Comprehensive Center, 2010
The No Child Left Behind Act of 2001 (NCLB, 2002) has produced an explosion of interest in the use of assessment to measure and improve student learning. Initially focused on annual state tests, educators quickly learned that results came too little and too late to identify students who were falling behind. At the same time, evidence from the…
Descriptors: Federal Legislation, Formative Evaluation, Benchmarking, Educational Assessment
Herman, Joan L.; Osmundson, Ellen; Dietel, Ronald – Assessment and Accountability Comprehensive Center, 2010
This report describes the purposes of benchmark assessments and provides recommendations for selecting and using benchmark assessments--addressing validity, alignment, reliability, fairness and bias and accessibility, instructional sensitivity, utility, and reporting issues. We also present recommendations on building capacity to support schools'…
Descriptors: Multiple Choice Tests, Test Items, Benchmarking, Educational Assessment
Ekstrom, Ruth B. – 1979
Three areas of concern related to test bias and validity should be considered during the revision of the Standards for Educational and Psychological Tests. The first area concerns the sources and consequences of test bias. Five sources of bias have been identified: numerical bias, role bias, status bias, stereotypic bias, and familiarity bias. The…
Descriptors: Evaluation Criteria, Psychometrics, Test Bias, Test Construction
Joint Committee on Testing Practices, Washington, DC. – 1988
Guidelines for test developers and users are provided to ensure that construction and selection of test instruments are conducted fairly. In addition to development and selection, issues of interpretation of scores; provision of information to test takers; and prevention of test bias based on race, gender, or ethnicity are addressed. Twenty-one…
Descriptors: Codes of Ethics, Educational Testing, Test Bias, Test Construction
Shermis, Mark D.; DiVesta, Francis J. – Rowman & Littlefield Publishers, Inc., 2011
"Classroom Assessment in Action" clarifies the multi-faceted roles of measurement and assessment and their applications in a classroom setting. Comprehensive in scope, Shermis and DiVesta explain basic measurement concepts and show students how to interpret the results of standardized tests. From these basic concepts, the authors then…
Descriptors: Student Evaluation, Standardized Tests, Scores, Measurement
Jones, Arthur C. – 1975
Psychological testing as an area has perhaps evoked more controversy and heated emotion than has any other area within the fields of psychology and counseling. Part of the reason for this has to do with the inherent complexity and difficulty of the task of assessing human abilities, emotions and achievements. But beyond this basic issue, an…
Descriptors: Behavior Change, Behavioral Science Research, Conflict, History
Peer reviewed
Green, Donald Ross – Education and Urban Society, 1975
States that to demonstrate that a test is not biased for any given use, it is sufficient to show that it is equally valid for different groups. Although an examination of criterion-related validity can indicate bias, it is not ordinarily sufficient to establish a lack of bias, for which explorations of its construct validity regardless of use are…
Descriptors: Evaluation Criteria, Placement, Predictive Validity, Program Descriptions
Peer reviewed
Gold, Margaret G.; Bruno, Joseph F. – Education and Urban Society, 1975
Reviews the judicial-legal definition of test bias as it has emerged in court litigation, asserting that while technical research studies of test bias have focused either on predictive validity or content validity, the judicial-legal concepts of test bias have attempted to focus on both these issues in a relatively non-technical manner. (JM)
Descriptors: Court Litigation, Employment Opportunities, Employment Practices, Evaluation Criteria
Green, Donald Ross – 1975
Biased tests systematically favor some groups over others as a result of factors not part of what the test is said to measure. Bias is basically a problem of differential validity. Validity can be discussed in terms of either the procedures for establishing it or test use. Both ways clarify bias in any test. For content and construct validity, the…
Descriptors: Achievement Tests, Groups, Individual Differences, Placement