Showing 1 to 15 of 32 results
Peer reviewed
PDF on ERIC: Download full text
Mücahit Öztürk – Open Praxis, 2024
This study examined the problems that pre-service teachers face in the online assessment process and their suggestions for solutions to these problems. The participants were 136 pre-service teachers who had extensive experience with online assessment and who took the Foundations of Open and Distance Learning course. This research is a…
Descriptors: Foreign Countries, Preservice Teacher Education, Preservice Teachers, Distance Education
Peer reviewed
PDF on ERIC: Download full text
Orrill, Chandra Hawley; Kim, Ok-Kyeong; Peters, Susan A.; Lischka, Alyson E.; Jong, Cindy; Sanchez, Wendy B.; Eli, Jennifer A. – Mathematics Teacher Education and Development, 2015
Developing and writing assessment items that measure teachers' knowledge is an intricate and complex undertaking. In this paper, we begin with an overview of what is known about measuring teacher knowledge. We then highlight the challenges inherent in creating assessment items that focus specifically on measuring teachers' specialised knowledge…
Descriptors: Specialization, Knowledge Base for Teaching, Educational Strategies, Testing Problems
Peer reviewed
Direct link
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
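The mechanism behind this abstract is the standard cluster-sampling design effect, DEFF = 1 + (m - 1) * rho, where m is the average cluster (e.g., classroom) size and rho is the intraclass correlation; naive standard errors must be inflated by sqrt(DEFF). A minimal sketch of that inflation (the function names are illustrative, not from the article):

```python
import math

def design_effect(avg_cluster_size: float, icc: float) -> float:
    # Kish design effect for cluster samples: DEFF = 1 + (m - 1) * rho.
    return 1.0 + (avg_cluster_size - 1.0) * icc

def adjusted_se(srs_se: float, avg_cluster_size: float, icc: float) -> float:
    # Standard errors computed as if under simple random sampling must be
    # inflated by sqrt(DEFF); skipping this step understates sampling error.
    return srs_se * math.sqrt(design_effect(avg_cluster_size, icc))

# 25 students per sampled classroom with ICC = 0.20:
print(design_effect(25, 0.20))      # 5.8
print(adjusted_se(0.01, 25, 0.20))  # ~0.024, about 2.4x the naive SE
```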
Peer reviewed
Direct link
Makransky, Guido; Glas, Cees A. W. – International Journal of Testing, 2013
Cognitive ability tests are widely used in organizations around the world because they have high predictive validity in selection contexts. Although these tests typically measure several subdomains, testing is usually carried out for a single subdomain at a time. This can be ineffective when the subdomains assessed are highly correlated. This…
Descriptors: Foreign Countries, Cognitive Ability, Adaptive Testing, Feedback (Response)
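The inefficiency the abstract describes has a simple bivariate-normal illustration: if two subdomain abilities correlate at rho, observing the first already shrinks the variance of the second by the factor 1 - rho^2. The sketch below is only that illustration, not the multidimensional adaptive algorithm the paper evaluates:

```python
def conditional_sd(prior_sd: float, rho: float) -> float:
    # Bivariate normal: Var(theta2 | theta1) = (1 - rho**2) * Var(theta2),
    # so information gathered on subdomain 1 carries over to subdomain 2.
    return prior_sd * (1.0 - rho ** 2) ** 0.5

# With rho = 0.8, measuring the first subdomain already cuts the second
# subdomain's posterior SD by 40% before any of its items are administered.
print(conditional_sd(1.0, 0.8))  # 0.6
```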
Peer reviewed
Direct link
Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009
Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…
Descriptors: Test Bias, Test Items, Evaluation Methods, Scores
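The core move in DSF is to dichotomize a polytomous item at each step (Y >= k versus Y < k) and estimate a group effect per step rather than a single item-level index. A bare-bones sketch of those step-level contrasts; it deliberately omits the matching on ability that operational DSF estimators require, and the 0.5 terms are a standard continuity correction for empty cells:

```python
import math

def step_log_odds_ratios(scores_ref, scores_foc, max_score):
    """For each step k = 1..max_score, dichotomize the polytomous score
    at Y >= k and compute a reference/focal log-odds ratio. A single
    item-level DIF index would average these step effects away."""
    def log_odds(scores, k):
        hi = sum(1 for y in scores if y >= k) + 0.5  # continuity correction
        lo = sum(1 for y in scores if y < k) + 0.5
        return math.log(hi / lo)

    return [log_odds(scores_ref, k) - log_odds(scores_foc, k)
            for k in range(1, max_score + 1)]

# Example: the groups diverge only at the highest step.
ref = [0, 1, 1, 2, 2, 3, 3, 3]
foc = [0, 1, 1, 2, 2, 2, 2, 3]
print(step_log_odds_ratios(ref, foc, max_score=3))
```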
Peer reviewed
Direct link
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Peer reviewed
Direct link
de La Torre, Jimmy; Karelitz, Tzur M. – Journal of Educational Measurement, 2009
Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure…
Descriptors: Simulation, Item Response Theory, Psychometrics, Evaluation Methods
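A "linear attribute structure" here means the K attributes form a chain, so mastering attribute k presupposes mastering attributes 1 through k - 1; this collapses the latent space from 2^K mastery profiles to K + 1, which is what makes retrofitting tractable. A small sketch of that collapse (the function name is illustrative):

```python
def linear_attribute_profiles(n_attributes: int):
    """Under a linear hierarchy A1 -> A2 -> ... -> AK, the only
    permissible mastery profiles are (0,...,0), (1,0,...,0), ...,
    (1,...,1): K + 1 latent classes instead of 2**K."""
    return [tuple(1 if j < k else 0 for j in range(n_attributes))
            for k in range(n_attributes + 1)]

print(linear_attribute_profiles(4))  # 5 profiles rather than 16
```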
Peer reviewed
Direct link
Papay, John P. – American Educational Research Journal, 2011
Recently, educational researchers and practitioners have turned to value-added models to evaluate teacher performance. Although value-added estimates depend on the assessment used to measure student achievement, the importance of outcome selection has received scant attention in the literature. Using data from a large, urban school district, I…
Descriptors: Urban Schools, Teacher Effectiveness, Reading Achievement, Achievement Tests
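The dependence on outcome selection is easy to reproduce in miniature: at its crudest, a value-added estimate is a teacher's average residual after regressing current scores on prior scores, so swapping in a different outcome assessment can reorder teachers. A deliberately stripped-down sketch, not the article's model (real value-added models add demographic and classroom controls):

```python
import numpy as np

def value_added(prior, current, teacher_ids):
    """Crude value-added sketch: regress current scores on prior scores,
    then average the residuals within each teacher's classroom."""
    prior = np.asarray(prior, dtype=float)
    current = np.asarray(current, dtype=float)
    teacher_ids = np.asarray(teacher_ids)
    X = np.column_stack([np.ones_like(prior), prior])
    beta, *_ = np.linalg.lstsq(X, current, rcond=None)
    resid = current - X @ beta
    return {t: float(resid[teacher_ids == t].mean())
            for t in np.unique(teacher_ids)}

# Running this twice with two different reading assessments as `current`
# can rank the same teachers differently, which is the article's point.
```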
Peer reviewed
Direct link
Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…
Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)
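What "monitoring rater performance over time" can look like in its simplest form: compare each rater to the consensus score within successive time windows and watch for a trend. The article's indices are model-based; this schematic stand-in only conveys the idea:

```python
def drift_by_window(rater_scores, consensus_scores, window_size):
    """Mean rater-minus-consensus deviation per time window.
    A rater whose standards drift shows a trend across windows;
    a stable rater shows noise around a constant offset."""
    means = []
    for start in range(0, len(rater_scores), window_size):
        chunk = list(zip(rater_scores[start:start + window_size],
                         consensus_scores[start:start + window_size]))
        means.append(sum(r - c for r, c in chunk) / len(chunk))
    return means

# Scores creep upward over three scoring sessions:
print(drift_by_window([3, 3, 4, 4, 5, 5], [3, 3, 3, 3, 3, 3], 2))
# [0.0, 1.0, 2.0]
```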
Peer reviewed
Direct link
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
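For context on the procedure under discussion: in an Angoff study each judge estimates the probability that a minimally competent examinee answers each multiple-choice item correctly, and the cut score is the item-sum of those probabilities averaged across judges; the performance data mentioned in the abstract help judges calibrate those estimates. A minimal sketch:

```python
def angoff_cut_score(judgments):
    """judgments[j][i]: expert j's estimated probability that a minimally
    competent examinee answers item i correctly. The Angoff cut score is
    the item-sum of those probabilities, averaged over experts."""
    per_expert = [sum(items) for items in judgments]
    return sum(per_expert) / len(per_expert)

# Three experts, four items:
judgments = [
    [0.6, 0.7, 0.5, 0.8],
    [0.5, 0.8, 0.4, 0.9],
    [0.7, 0.6, 0.6, 0.8],
]
print(angoff_cut_score(judgments))  # ~2.63 out of 4; round per policy
```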
Peer reviewed
Direct link
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
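The HCI is computable directly from a response vector once the hierarchy identifies each item's prerequisite items: every correctly answered item is compared with the items whose attributes it subsumes, a miss on any of those counts as a misfit, and HCI = 1 - 2 * misfits / comparisons, which bounds the index to [-1, 1]. A sketch under that definition (the prereqs encoding is ours):

```python
def hci(responses, prereqs):
    """Hierarchy consistency index sketch (after Cui & Leighton, 2009).
    responses[j] is 1/0 for item j; prereqs[j] lists the items whose
    required attributes are a subset of item j's, so answering j correctly
    while missing a prerequisite item is a misfit."""
    misfits = comparisons = 0
    for j, correct in enumerate(responses):
        if correct == 1:
            for k in prereqs[j]:
                comparisons += 1
                if responses[k] == 0:
                    misfits += 1
    if comparisons == 0:
        return 1.0
    return 1.0 - 2.0 * misfits / comparisons

# Item 2 requires everything items 0 and 1 require; item 1 was missed:
print(hci([1, 0, 1], {0: [], 1: [], 2: [0, 1]}))  # 0.0
```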
Peer reviewed
Direct link
Moore, David Richard – International Journal on E-Learning, 2008
Quality assurance in instructional development demands an exhaustive formative evaluation effort and applied testing. Unfortunately, this process is expensive and requires large numbers of user testers with characteristics similar to the intended audience. This article presents a procedure for increasing the efficiency of quality assurance efforts…
Descriptors: Instructional Development, Formative Evaluation, Quality Control, Technology Integration
Peer reviewed
Kamhi, Alan G. – Language, Speech, and Hearing Services in Schools, 1993
This commentary on the mismeasurement of language and reading comprehension abilities argues that quantitative measures of complex behaviors and subsequent rankings of individual performance often do not accurately reflect the abstract constructs they purport to measure, and that inappropriate quantification and ranking create and perpetuate potentially…
Descriptors: Elementary Secondary Education, Evaluation Criteria, Evaluation Methods, Evaluation Problems
Peer reviewed
Direct link
DiBello, Lou; Stout, William – Measurement: Interdisciplinary Research and Perspectives, 2007
In this article, the authors provide their critique of a set of papers that investigated the Mathematical Knowledge for Teaching (MKT) assessment and the underlying theory and characteristics of the validity enterprise. Three types of assumptions and inferences--elemental, structural, and ecological--are discussed in these papers. These assumptions…
Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research
Peer reviewed
Direct link
Ferrara, Steve – Measurement: Interdisciplinary Research and Perspectives, 2007
In this issue of Measurement: Interdisciplinary Research and Perspectives, Schilling et al. are explicit about the centrality of assessment design and development and psychometric analysis in validation. Schilling and colleagues, Kane (2004, 2006), other contemporary validity theorists and practitioners, and their predecessors typically discuss…
Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research