ERIC - Search Results

Showing 1 to 15 of 49 results

The Effect of Person Misfit on Item Parameter Estimation and Classification Accuracy: A Simulation Study

Peer reviewed

Mousavi, Amin; Cui, Ying – Education Sciences, 2020

Often, important decisions regarding accountability and placement of students in performance categories are made on the basis of test scores; therefore, it is important to evaluate the validity of the inferences derived from test results. One of the threats to the validity of such inferences is aberrant responding. Several…

Descriptors: Student Evaluation, Educational Testing, Psychological Testing, Item Response Theory

Reporting Proficiency Levels for Examinees with Incomplete Data

Peer reviewed

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2022

Takers of educational tests often receive proficiency levels instead of or in addition to scaled scores. For example, proficiency levels are reported for the Advanced Placement (AP®) and U.S. Medical Licensing examinations. Technical difficulties and other unforeseen events occasionally lead to missing item scores and hence to incomplete data on…

Descriptors: Computation, Data Analysis, Educational Testing, Accuracy

Three Metrics for Monitoring Educational Progress When Tested Populations Change

Peer reviewed

Ho, Andrew – Teachers College Record, 2025

Background/Context: Public monitoring of educational progress and inequality often involves tracking changes in the percentage of "proficient" students across groups and over time. These trends are important signals of state and district provision of educational opportunity. I show how known flaws of this percentage metric, sometimes…

Descriptors: Educational Assessment, Progress Monitoring, Educational Trends, Educational Opportunities

Promoting Validity in the Assessment of English Learners

Peer reviewed

Sireci, Stephen G.; Faulkner-Bond, Molly – Review of Research in Education, 2015

Across the globe, educational tests are being used at a rapidly increasing rate. More recently, educational tests are being used to inform educational policy and for holding educators accountable for student learning. One reason educational assessments are used for these important purposes is that they are considered to provide reliable and…

Descriptors: English Language Learners, Accountability, Educational Testing, Student Evaluation

A Comparison of Computer-Based Classification Testing Approaches Using Mixed-Format Tests with the Generalized Partial Credit Model

Kim, Jiseon – ProQuest LLC, 2010

Classification testing has been widely used to make categorical decisions by determining whether an examinee has a certain degree of ability required by established standards. As computer technologies have developed, classification testing has become more computerized. Several approaches have been proposed and investigated in the context of…

Descriptors: Test Length, Computer Assisted Testing, Classification, Probability

Item Parameter Drift as an Indication of Differential Opportunity to Learn: An Exploration of Item Flagging Methods & Accurate Classification of Examinees

Sukin, Tia M. – ProQuest LLC, 2010

The presence of outlying anchor items is an issue faced by many testing agencies. The decision to retain or remove an item is a difficult one, especially when the content representation of the anchor set becomes questionable by item removal decisions. Additionally, the reason for the aberrancy is not always clear, and if the performance of the…

Descriptors: Simulation, Science Achievement, Sampling, Data Analysis

A Proposed Framework of Test Administration Methods

Peer reviewed

Thompson, Nathan A. – Journal of Applied Testing Technology, 2008

The widespread application of personal computers to educational and psychological testing has substantially increased the number of test administration methodologies available to testing programs. Many of these mediums are referred to by their acronyms, such as CAT, CBT, CCT, and LOFT. The similarities between the acronyms and the methods…

Descriptors: Testing Programs, Psychological Testing, Classification, Educational Testing

Profile Analysis: A Closer Look at the PISA 2000 Reading Data

Peer reviewed

Verhelst, Norman D. – Scandinavian Journal of Educational Research, 2012

When using IRT models in Educational Achievement Testing, the model is as a rule too simple to catch all the relevant dimensions in the test. It is argued that a simple model may nevertheless be useful but that it can be complemented with additional analyses. Such an analysis, called profile analysis, is proposed and applied to the reading data of…

Descriptors: Multidimensional Scaling, Profiles, Item Response Theory, Achievement Tests

Defending the Quality of Links between Scores from Different Tests and Exams

Peer reviewed

Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010

Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Conceptualizing Comparability

Peer reviewed

Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010

This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Linking through Improved Design, Not Redefinition: Commentary on Newton

Peer reviewed

Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010

"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…

Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques

Diagnostic Classification Models and Multidimensional Adaptive Testing: A Commentary on Rupp and Templin

Peer reviewed

Frey, Andreas; Carstensen, Claus H. – Measurement: Interdisciplinary Research and Perspectives, 2009

On a general level, the objective of diagnostic classification models (DCMs) lies in a classification of individuals regarding multiple latent skills. In this article, the authors show that this objective can be achieved by multidimensional adaptive testing (MAT) as well. The authors discuss whether or not the restricted applicability of DCMs can…

Descriptors: Adaptive Testing, Test Items, Classification, Psychometrics

What Dictates the Meaning of Test Linking? A Reaction to "Thinking about Linking"

Peer reviewed

von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010

The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…

Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria

An Analysis of the Use and Validity of Test-Based Teacher Evaluations Reported by the "Los Angeles Times": 2011

Peer reviewed

Durso, Catherine S. – National Education Policy Center, 2012

In May of 2011, the "Los Angeles Times" published, for the second time, results of statistical studies examining the variation in teacher and school performance in the Los Angeles Unified School District, based on the California Standards Tests for math and English Language Arts (ELA). The studies use data from the seven academic years…

Descriptors: School Effectiveness, Teacher Effectiveness, Newspapers, News Reporting

Improving Marking Quality through a Taxonomy of Mark Schemes

Peer reviewed

Ahmed, Ayesha; Pollitt, Alastair – Assessment in Education: Principles, Policy & Practice, 2011

At the heart of most assessments lies a set of questions, and those who write them must achieve "two" things. Not only must they ensure that each question elicits the kind of performance that shows how "good" pupils are at the subject, but they must also ensure that each mark scheme gives more marks to those who are…

Descriptors: Academic Achievement, Classification, Educational Quality, Quality Assurance
