Showing 61 to 75 of 300 results
Peer reviewed
Carlson, Janet F.; Geisinger, Kurt F. – International Journal of Testing, 2012
The test review process used by the Buros Center for Testing is described as a series of 11 steps: (1) identifying tests to be reviewed, (2) obtaining tests and preparing test descriptions, (3) determining whether tests meet review criteria, (4) identifying appropriate reviewers, (5) selecting reviewers, (6) sending instructions and materials to…
Descriptors: Testing, Test Reviews, Evaluation Methods, Evaluation Criteria
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias
Sawchuk, Stephen – Education Week, 2010
Most experts in the testing community have presumed that the $350 million promised by the U.S. Department of Education to support common assessments would promote those that made greater use of open-ended items capable of measuring higher-order critical-thinking skills. But as measurement experts consider the multitude of possibilities for an…
Descriptors: Test Items, Federal Legislation, Scoring, Accountability
Peer reviewed
Chiavaroli, Neville; Familari, Mary – Bioscience Education, 2011
This paper outlines the use of item analysis to assist examiners in evaluating the quality and validity of their MCQ exam questions. The generation of item analysis, particularly the discrimination index, has long been an established practice in professional testing and credentialing organisations and in some disciplines in tertiary education, but its use…
Descriptors: Self Actualization, Time Management, Audiences, Museums
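The Chiavaroli and Familari entry above refers to the discrimination index used in classical item analysis of MCQ exams. As a hedged illustration only, not material from the paper itself, the Python sketch below computes two commonly reported discrimination statistics for dichotomous items: the item-restscore point-biserial correlation and the upper-lower (27%) difference index. The response matrix is hypothetical.

import numpy as np

def discrimination_indices(responses):
    # responses: 2-D 0/1 array, rows = examinees, columns = items (1 = correct)
    responses = np.asarray(responses, dtype=float)
    n_persons, n_items = responses.shape
    total = responses.sum(axis=1)
    k = max(1, int(round(0.27 * n_persons)))      # conventional 27% split
    order = np.argsort(total)
    lower, upper = order[:k], order[-k:]
    point_biserial = np.empty(n_items)
    upper_lower_d = np.empty(n_items)
    for j in range(n_items):
        rest = total - responses[:, j]             # restscore avoids item-total overlap
        point_biserial[j] = np.corrcoef(responses[:, j], rest)[0, 1]
        upper_lower_d[j] = responses[upper, j].mean() - responses[lower, j].mean()
    return point_biserial, upper_lower_d

# Hypothetical data: 6 examinees, 4 items.
demo = np.array([[1, 0, 1, 1],
                 [1, 1, 1, 0],
                 [0, 0, 1, 0],
                 [1, 1, 1, 1],
                 [0, 0, 0, 0],
                 [1, 0, 1, 1]])
pb, d = discrimination_indices(demo)
print("point-biserial vs restscore:", np.round(pb, 2))
print("upper-lower D (27% groups):", np.round(d, 2))

Items with values near zero (or negative) on either statistic are the ones item analysis would flag for review.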
Peer reviewed
Dyson, Ben; Placek, Judith H.; Graber, Kim C.; Fisette, Jennifer L.; Rink, Judy; Zhu, Weimo; Avery, Marybell; Franck, Marian; Fox, Connie; Raynes, De; Park, Youngsik – Measurement in Physical Education and Exercise Science, 2011
This article describes how assessments in PE Metrics were developed following six steps: (a) determining test blueprint, (b) writing assessment tasks and scoring rubrics, (c) establishing content validity, (d) piloting assessments, (e) conducting item analysis, and (f) modifying the assessments based on analysis and expert opinion. A task force,…
Descriptors: Expertise, Evidence, Physical Education, Elementary Education
Peer reviewed
Muniz, Jose; Fernandez-Hermida, Jose R.; Fonseca-Pedrero, Eduardo; Campillo-Alvarez, Angela; Pena-Suarez, Elsa – International Journal of Testing, 2012
The proper use of psychological tests requires that the measurement instruments have adequate psychometric properties, such as reliability and validity, and that the professionals who use the instruments have the necessary expertise. In this article, we present the first review of tests published in Spain, carried out with an assessment model…
Descriptors: Student Evaluation, Measurement, Foreign Countries, Psychometrics
Peer reviewed
Little, Mary E. – Educational Forum, 2012
The purpose of this article is to define and clarify the process of instructional problem-solving using assessment data within action research (AR) and Response to Intervention (RtI). Similarities between AR and RtI are defined and compared. Lastly, specific resources and examples of the instructional problem-solving process of AR within…
Descriptors: Intervention, Action Research, Problem Solving, Data Analysis
Peer reviewed
el-Guebaly, Nady; Violato, Claudio – Substance Abuse, 2011
The experience of the International Society of Addiction Medicine in setting up the first international certification of clinical knowledge is reported. The steps followed and the results of a psychometric analysis of the tests from the first 65 candidates are reported. Lessons learned in the first 5 years and challenges for the future are…
Descriptors: Psychometrics, Certification, Substance Abuse, Medicine
Peer reviewed
Wang, Wen-Chung; Huang, Sheng-Yun – Educational and Psychological Measurement, 2011
The one-parameter logistic model with ability-based guessing (1PL-AG) has recently been developed to account for the effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their…
Descriptors: Computer Assisted Testing, Classification, Item Analysis, Probability
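The Wang and Huang entry above concerns classification algorithms built on the 1PL-AG model. The sketch below is not the authors' procedure and substitutes a plain Rasch (1PL) item response function for the 1PL-AG, whose exact form is not reproduced here; it only illustrates the general sequential probability ratio test (SPRT) idea behind computerized classification testing. The cut region, error rates, item bank, and examinee are hypothetical.

import math
import random

def p_correct(theta, b):
    # Rasch (1PL) item response function; a stand-in for the 1PL-AG model
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def sprt_classify(responses, difficulties, theta0=-0.5, theta1=0.5,
                  alpha=0.05, beta=0.05):
    # Wald's sequential probability ratio test for a pass/fail decision;
    # theta0 and theta1 bracket the cut score (the indifference region).
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    llr = 0.0
    for x, b in zip(responses, difficulties):
        p0, p1 = p_correct(theta0, b), p_correct(theta1, b)
        llr += math.log((p1 if x else 1 - p1) / (p0 if x else 1 - p0))
        if llr <= lower:
            return "fail"
        if llr >= upper:
            return "pass"
    return "undecided"

random.seed(1)
bank = [random.uniform(-2, 2) for _ in range(40)]                  # hypothetical item bank
responses = [int(random.random() < p_correct(0.8, b)) for b in bank]  # simulated examinee
print(sprt_classify(responses, bank))

Testing stops as soon as the log likelihood ratio crosses either bound, which is what makes classification testing shorter on average than full-length ability estimation.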
Foy, Pierre, Ed.; Arora, Alka, Ed.; Stanco, Gabrielle M., Ed. – International Association for the Evaluation of Educational Achievement, 2013
This supplement describes national adaptations made to the international version of the TIMSS 2011 background questionnaires. This information provides users with a guide to evaluate the availability of internationally comparable data for use in secondary analyses involving the TIMSS 2011 background variables. Background questionnaire adaptations…
Descriptors: Questionnaires, Technology Transfer, Adoption (Ideas), Media Adaptation
Peer reviewed
Forster, Kenneth I. – Journal of Memory and Language, 2008
It is commonly assumed that a significant item analysis (F2) provides assurance that the treatment effect generalizes to the population from which the items were drawn, which in turn implies that the effect is reasonably general across items. The latter implication is shown to be false, and it is argued that a new test of…
Descriptors: Item Analysis, Evaluation Methods
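Forster's entry above concerns the by-items analysis (F2) and what it does or does not license. Purely as an illustration of the standard by-subjects versus by-items contrast, and not of the new test the paper argues for, the sketch below runs both analyses for a two-condition design using paired t tests, the two-condition special case of F1 and F2. The reaction-time data are simulated.

import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(3)
# Hypothetical reaction times: 12 subjects x 10 items; the same items
# appear in both conditions A and B.
cond_a = 600 + rng.normal(0, 40, size=(12, 10))
cond_b = 630 + rng.normal(0, 40, size=(12, 10))

# t1 (by subjects): average over items, treat subjects as the random factor.
t1 = ttest_rel(cond_a.mean(axis=1), cond_b.mean(axis=1))
# t2 (by items): average over subjects, treat items as the random factor.
t2 = ttest_rel(cond_a.mean(axis=0), cond_b.mean(axis=0))
print("by subjects:", t1.statistic, t1.pvalue)
print("by items:   ", t2.statistic, t2.pvalue)

Forster's point is that a significant t2 (or F2) of this kind does not by itself guarantee that the effect holds across most items, only that it is unlikely to be zero on average for the item population.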
Peer reviewed
Kreiner, Svend – Applied Psychological Measurement, 2011
To rule out the need for a two-parameter item response theory (IRT) model during item analysis by Rasch models, it is important to check the Rasch model's assumption that all items have the same item discrimination. Biserial and polyserial correlation coefficients measuring the association between items and restscores are often used in an informal…
Descriptors: Item Analysis, Correlation, Item Response Theory, Models
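Kreiner's entry above discusses the informal use of biserial correlations between items and restscores when checking the Rasch model's equal-discrimination assumption. As an illustration of that informal check only, and not of the formal test the article develops, the sketch below converts each item's point-biserial item-restscore correlation into a biserial coefficient via the classical normal-ordinate formula; the responses are hypothetical.

import numpy as np
from scipy.stats import norm

def biserial_item_restscore(responses):
    # Biserial correlation between each 0/1 item and its restscore,
    # using r_bis = r_pb * sqrt(p * q) / phi(z_p).
    responses = np.asarray(responses, dtype=float)
    total = responses.sum(axis=1)
    coefs = []
    for j in range(responses.shape[1]):
        item = responses[:, j]
        rest = total - item
        p = item.mean()
        r_pb = np.corrcoef(item, rest)[0, 1]     # point-biserial
        y = norm.pdf(norm.ppf(p))                # normal ordinate at the split
        coefs.append(r_pb * np.sqrt(p * (1 - p)) / y)
    return np.array(coefs)

# Hypothetical 0/1 responses: 8 examinees, 5 items.
demo = np.array([[1, 0, 1, 1, 0],
                 [1, 1, 1, 0, 1],
                 [0, 0, 1, 0, 0],
                 [1, 1, 1, 1, 1],
                 [0, 0, 0, 1, 0],
                 [1, 0, 1, 1, 1],
                 [0, 1, 0, 0, 0],
                 [1, 1, 1, 1, 0]])
print(np.round(biserial_item_restscore(demo), 2))

Under the Rasch assumption of equal item discrimination these coefficients should be roughly homogeneous across items; the article's concern is that eyeballing such coefficients is informal, hence the case for a formal test.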
Peer reviewed
Pounder, Diana – Journal of Research on Leadership Education, 2012
This article addresses the leadership preparation line of inquiry developed in the past decade by the University Council for Educational Administration/Learning and Teaching in Educational Leadership Special Interest Group Taskforce on Evaluating Leadership Preparation Programs, and it particularly addresses the series of survey instruments…
Descriptors: Administrator Education, Educational Administration, Instructional Leadership, Program Evaluation
Peer reviewed
Ketterlin-Geller, Leanne R.; Yovanoff, Paul; Jung, EunJu; Liu, Kimy; Geller, Josh – Educational Assessment, 2013
In this article, we highlight the need for a precisely defined construct in score-based validation and discuss the contribution of cognitive theories to accurately and comprehensively defining the construct. We propose a framework for integrating cognitively based theoretical and empirical evidence to specify and evaluate the construct. We apply…
Descriptors: Test Validity, Construct Validity, Scores, Evidence
Peer reviewed
Gonyea, Robert M.; Miller, Angie – New Directions for Institutional Research, 2011
Correlations between self-reported learning gains and direct, longitudinal measures that ostensibly correspond in content area are generally inadequate. This chapter clarifies that self-reported measures of learning are more properly used and interpreted as evidence of students' perceived learning and affective outcomes. In this context, the…
Descriptors: Evidence, College Students, Institutional Research, Social Desirability