Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
Pearson product-moment correlation coefficient between item g and test score X, known as item-test or item-total correlation ("Rit"), and item-rest correlation ("Rir") are two of the most used classical estimators for item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…
Descriptors: Correlation, Test Items, Scores, Difficulty Level
Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
Liu, Hsin-min – ProQuest LLC, 2014
One of the fundamental problems in language testing is the lack of adequate generalizability between what a test is measuring and what fulfills the learners' real-world language use needs. It is important to recognize that no matter how precisely a test measures a construct, if the way that a construct is defined and the way that test tasks are…
Descriptors: Reading Tests, Language Tests, Task Analysis, Generalizability Theory
Harvey, Donzel Wayne – ProQuest LLC, 2013
Purpose: The purpose of this study was to examine the college-readiness rates of Black, Hispanic, White, and Asian graduates of public secondary schools in Texas using archival data from the Texas Education Agency Academic Excellence Indicator System. Data examined were the average ACT and SAT scores for the past 10 school years (i.e., 2001-2002…
Descriptors: College Readiness, Ethnicity, Racial Differences, African American Students
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Gill, Brian; Bruch, Julie; Booker, Kevin – Regional Educational Laboratory Mid-Atlantic, 2013
States are increasingly interested in including measures of student achievement growth, or "value-added," in evaluating teachers. Annual state assessments, however, which are the typical measure of student growth, usually cover only reading and math teachers and only in grades 4-8. These state assessments thus cannot…
Descriptors: Teacher Evaluation, Teacher Competencies, Evaluation Methods, Educational Testing
Gill, Brian; Bruch, Julie; Booker, Kevin – Regional Educational Laboratory Mid-Atlantic, 2013
States and school districts are exploring alternatives to state tests for measuring teachers' contributions to student learning. One approach applies statistical value-added methods to alternative student assessments such as commercially available tests and end-of-course tests. The evidence suggests that these methods can reliably distinguish…
Descriptors: Teacher Evaluation, Teacher Competencies, Evaluation Methods, Educational Testing
Breton, Theodore R. – Economics of Education Review, 2011
This paper challenges Hanushek and Woessmann's (2008) contention that the quality and not the quantity of schooling determines a nation's rate of economic growth. I first show that their statistical analysis is flawed. I then show that when a nation's average test scores and average schooling attainment are included in a national income model,…
Descriptors: Economic Progress, Income, Statistical Significance, Educational Quality
Joseph, Dane Christian – ProQuest LLC, 2010
Multiple-choice item-writing guideline research is in its infancy. Haladyna (2004) calls for a science of item-writing guideline research. The purpose of this study is to respond to such a call. The purpose of this study was to examine the impact of student ability and method for varying the location of correct answers in classroom multiple-choice…
Descriptors: Evidence, Test Format, Guessing (Tests), Program Effectiveness
Schochet, Peter Z.; Chiang, Hanley S. – National Center for Education Evaluation and Regional Assistance, 2010
This paper addresses likely error rates for measuring teacher and school performance in the upper elementary grades using value-added models applied to student test score gain data. Using realistic performance measurement system schemes based on hypothesis testing, we develop error rate formulas based on OLS and Empirical Bayes estimators.…
Descriptors: Teacher Effectiveness, Teacher Evaluation, Student Evaluation, Scores
UCLA IDEA, 2012
Value-added measures (VAM) use changes in student test scores to determine how much "value" an individual teacher has "added" to student growth during the school year. Some policymakers, school districts, and educational advocates have applauded VAM as a straightforward measure of teacher effectiveness: the better a teacher,…
Descriptors: Teacher Effectiveness, Teacher Evaluation, Educational Testing, Standardized Tests
Munoz, Marco A.; Prather, Joseph R.; Stronge, James H. – Planning and Changing, 2011
Teacher effectiveness and evaluation using student growth measures is a popular reform strategy in education. Teachers can make a difference in student academic growth, but a question that begs an answer is how to go about measuring this impact. This study examines models of teacher effectiveness and the development of hierarchical linear models…
Descriptors: Reading Instruction, Elementary Education, Urban Schools, Teacher Effectiveness
Jarjoura, David – 1983
Issues regarding confidence and tolerance intervals are discussed within the context of educational measurement. Conceptual distinctions are drawn between these two types of intervals; and examples, under various error and true score models, are used to compare such intervals. It is shown that there tend to be only small differences in tolerance…
Descriptors: Educational Testing, Measurement Techniques, Models, Scores
Brennan, Robert L. – Educational Measurement: Issues and Practice, 1997
The history of generalizability theory (G theory) is told from the perspective of one researcher's experiences, describing psychometric and scientific perspectives that influenced the development of G theory and its adoption. Work that remains to be done in the field is outlined. (SLD)
Descriptors: Educational Testing, Generalizability Theory, Measurement, Psychometrics
Kang, Taehoon; Chen, Troy T. – ACT, Inc., 2007
Orlando and Thissen (2000, 2003) proposed an item-fit index, S-X[superscript 2], for dichotomous item response theory (IRT) models, which has performed better than traditional item-fit statistics such as Yen's (1981) Q[subscript 1] and McKinley and Mill's (1985) G[superscript 2]. This study extends the utility of S-X[superscript 2] to polytomous…
Descriptors: Item Response Theory, Models, Computer Software, Statistical Analysis
