Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 2 |
| Since 2007 (last 20 years) | 11 |
Descriptor
| Evaluation Methods | 22 |
| Test Items | 22 |
| Testing Problems | 22 |
| Test Bias | 10 |
| Test Construction | 10 |
| Test Format | 6 |
| Educational Testing | 4 |
| Evaluation Problems | 4 |
| Evaluation Research | 4 |
| Item Analysis | 4 |
| Test Reliability | 4 |
Author
| Childs, Ruth A. | 2 |
| Askegaard, Lewis D. | 1 |
| Babcock, Ben | 1 |
| Bhaskar, R. | 1 |
| Breithaupt, Krista | 1 |
| Camilli, Gregory | 1 |
| Chuah, Siang Chee | 1 |
| Cronje, Johannes C. | 1 |
| Cui, Ying | 1 |
| Dillard, Jesse F. | 1 |
| Eli, Jennifer A. | 1 |
Education Level
| Elementary Secondary Education | 1 |
| Higher Education | 1 |
| Postsecondary Education | 1 |
Audience
| Practitioners | 2 |
| Teachers | 2 |
| Students | 1 |
Location
| Canada | 1 |
| Netherlands | 1 |
| South Africa | 1 |
Laws, Policies, & Programs
| Education for All Handicapped… | 1 |
Assessments and Surveys
| Wechsler Adult Intelligence… | 1 |
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, little research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
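The mapping from a bookmark placement to a cut score can be made concrete with a small sketch. The snippet below is an illustration rather than the authors' procedure: it assumes a Rasch model, an ordered item booklet sorted by difficulty, and the common response-probability criterion RP = 0.67; the function name and the difficulty values are hypothetical.

```python
# Minimal sketch (not the authors' code): translating a Bookmark placement
# into a cut score under a Rasch model, given an ordered item booklet and a
# response-probability criterion RP.
import math

def bookmark_cut_score(difficulties, bookmark_position, rp=0.67):
    """Theta at which the last item before the bookmark is answered
    correctly with probability `rp` under the Rasch model.

    difficulties: item difficulties, easiest to hardest
    bookmark_position: 1-based page on which the panelist places the bookmark
    (assumed to be 2 or greater, i.e., not on the very first item)
    """
    b = sorted(difficulties)[bookmark_position - 2]  # item just before the bookmark
    # Solve rp = 1 / (1 + exp(-(theta - b))) for theta.
    return b + math.log(rp / (1.0 - rp))

# Hypothetical ordered item booklet and a bookmark placed on page 5.
difficulties = [-1.6, -0.9, -0.4, 0.1, 0.6, 1.2, 1.8]
print(round(bookmark_cut_score(difficulties, bookmark_position=5), 3))  # 0.808
```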
Fu, Jianbin; Qu, Yanxuan – ETS Research Report Series, 2018
Various subscore estimation methods that use auxiliary information to improve subscore accuracy and stability have been developed. This report provides a review of the subscore estimation methods described in the literature. The methodology of each method is described, and then research studies on these subscore estimation methods are summarized.…
Descriptors: Scores, Evaluation Methods, Item Response Theory, Test Items
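One building block shared by several of the reviewed approaches is Kelley's regressed score, which shrinks an observed subscore toward the group mean in proportion to the subscore's reliability; augmentation methods go further by also borrowing strength from the total score or the other subscores. The sketch below shows only that building block, with hypothetical scores and an assumed reliability.

```python
# Minimal sketch of Kelley's regressed subscore estimate. The reliability and
# the observed subscores are hypothetical; methods that use auxiliary
# information extend this idea rather than stopping here.
def kelley_estimate(observed, group_mean, reliability):
    """Shrink the observed subscore toward the group mean."""
    return reliability * observed + (1.0 - reliability) * group_mean

observed_subscores = [12, 18, 9, 15]
group_mean = sum(observed_subscores) / len(observed_subscores)   # 13.5
reliability = 0.55                                                # assumed subscale reliability

for score in observed_subscores:
    print(score, "->", round(kelley_estimate(score, group_mean, reliability), 2))
```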
Orrill, Chandra Hawley; Kim, Ok-Kyeong; Peters, Susan A.; Lischka, Alyson E.; Jong, Cindy; Sanchez, Wendy B.; Eli, Jennifer A. – Mathematics Teacher Education and Development, 2015
Developing and writing assessment items that measure teachers' knowledge is an intricate and complex undertaking. In this paper, we begin with an overview of what is known about measuring teacher knowledge. We then highlight the challenges inherent in creating assessment items that focus specifically on measuring teachers' specialised knowledge…
Descriptors: Specialization, Knowledge Base for Teaching, Educational Strategies, Testing Problems
Henning, Grant – English Teaching Forum, 2012
To some extent, good testing procedure, like good language use, can be achieved through avoidance of errors. Almost any language-instruction program requires the preparation and administration of tests, and it is only to the extent that certain common testing mistakes have been avoided that such tests can be said to be worthwhile selection,…
Descriptors: Testing, English (Second Language), Testing Problems, Student Evaluation
Emenogu, Barnabas C.; Falenchuk, Olesya; Childs, Ruth A. – Alberta Journal of Educational Research, 2010
Most implementations of the Mantel-Haenszel differential item functioning procedure delete records with missing responses or replace missing responses with scores of 0. These treatments of missing data make strong assumptions about the causes of the missing data. Such assumptions may be particularly problematic when groups differ in their patterns…
Descriptors: Foreign Countries, Test Bias, Test Items, Educational Testing
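For reference, the core of the Mantel-Haenszel procedure under discussion is the common odds ratio for a studied item, computed across strata of a matching total score; whether records with missing responses are deleted or missing responses are scored 0 is decided before this point and changes both the item scores and the matching totals. The sketch below is an illustration with hypothetical data, not the authors' implementation.

```python
# Minimal sketch of the Mantel-Haenszel common odds ratio for one studied item,
# stratifying on total score, plus the ETS D-DIF rescaling.
import math
from collections import defaultdict

def mh_odds_ratio(records):
    """records: iterable of (group, total_score, item_correct), where group is
    'ref' or 'focal' and item_correct is 0 or 1."""
    strata = defaultdict(lambda: {"A": 0, "B": 0, "C": 0, "D": 0, "N": 0})
    for group, total, correct in records:
        cell = strata[total]
        if group == "ref":
            cell["A" if correct else "B"] += 1   # reference right / wrong
        else:
            cell["C" if correct else "D"] += 1   # focal right / wrong
        cell["N"] += 1
    num = sum(s["A"] * s["D"] / s["N"] for s in strata.values() if s["N"])
    den = sum(s["B"] * s["C"] / s["N"] for s in strata.values() if s["N"])
    return num / den

# Tiny hypothetical data set: (group, matching total score, studied-item score).
data = [("ref", 3, 1), ("ref", 3, 1), ("ref", 3, 0), ("focal", 3, 1),
        ("focal", 3, 0), ("focal", 3, 0), ("ref", 5, 1), ("ref", 5, 1),
        ("focal", 5, 1), ("focal", 5, 0)]
alpha = mh_odds_ratio(data)
print("MH odds ratio:", round(alpha, 2), "| MH D-DIF:", round(-2.35 * math.log(alpha), 2))
```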
Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009
Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…
Descriptors: Test Bias, Test Items, Evaluation Methods, Scores
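The step-level idea can be sketched briefly: dichotomize the polytomous response at each step (score of s or higher) and estimate a separate group effect per step, instead of a single index for the whole item. The toy version below uses unconditional log-odds ratios with a continuity correction for brevity; an operational DSF analysis would condition on total score, as in standard DIF procedures. All data are hypothetical.

```python
# Toy sketch of differential step functioning: one group effect per step of a
# polytomous item rather than one index per item.
import math

def step_log_odds_ratios(responses, max_score):
    """responses: list of (group, item_score); returns one log-odds ratio per step."""
    out = []
    for step in range(1, max_score + 1):
        counts = {("ref", 0): 0.5, ("ref", 1): 0.5,       # 0.5 = continuity correction
                  ("focal", 0): 0.5, ("focal", 1): 0.5}
        for group, score in responses:
            counts[(group, int(score >= step))] += 1
        odds_ref = counts[("ref", 1)] / counts[("ref", 0)]
        odds_focal = counts[("focal", 1)] / counts[("focal", 0)]
        out.append(math.log(odds_ref / odds_focal))
    return out

# Hypothetical 0-3 item whose group difference is concentrated at the last step.
data = [("ref", 3), ("ref", 2), ("ref", 3), ("ref", 1), ("ref", 2),
        ("focal", 2), ("focal", 2), ("focal", 1), ("focal", 1), ("focal", 2)]
print([round(v, 2) for v in step_log_odds_ratios(data, max_score=3)])
```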
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Robitzsch, Alexander; Rupp, Andre A. – Educational and Psychological Measurement, 2009
This article describes the results of a simulation study to investigate the impact of missing data on the detection of differential item functioning (DIF). Specifically, it investigates how four methods for dealing with missing data (listwise deletion, zero imputation, two-way imputation, response function imputation) interact with two methods of…
Descriptors: Test Bias, Simulation, Interaction, Effect Size
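Two of the four treatments named here are easy to illustrate. The sketch below applies zero imputation and a basic two-way imputation (person mean plus item mean minus grand mean, without an added error term) to a small hypothetical 0/1 response matrix; it illustrates the general idea rather than reproducing the study's simulation.

```python
# Minimal sketch of two missing-data treatments for item responses, where
# None marks a missing response in a 0/1 matrix (rows = persons, cols = items).
def zero_impute(matrix):
    """Replace every missing response with 0."""
    return [[0 if x is None else x for x in row] for row in matrix]

def two_way_impute(matrix):
    """Replace a missing response with person mean + item mean - grand mean."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_vals = [[x for x in row if x is not None] for row in matrix]
    col_vals = [[matrix[i][j] for i in range(n_rows) if matrix[i][j] is not None]
                for j in range(n_cols)]
    grand = sum(x for vals in row_vals for x in vals) / sum(len(v) for v in row_vals)
    person_mean = [sum(v) / len(v) for v in row_vals]
    item_mean = [sum(v) / len(v) for v in col_vals]
    return [[matrix[i][j] if matrix[i][j] is not None
             else person_mean[i] + item_mean[j] - grand
             for j in range(n_cols)] for i in range(n_rows)]

responses = [[1, 1, None, 0],
             [0, None, 1, 1],
             [1, 0, 0, None]]
print(zero_impute(responses))
print([[round(x, 2) for x in row] for row in two_way_impute(responses)])
```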
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
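The general logic of a hierarchy-based person-fit index can be sketched as follows, though the snippet is not the authors' exact HCI formula: for every item a student answers correctly, check whether the items requiring a subset of its attributes were also answered correctly, then rescale the violation rate so that 1.0 means perfect consistency with the hierarchy and -1.0 means every comparison is violated. The attribute hierarchy and responses below are hypothetical.

```python
# Rough sketch (an assumption-laden simplification, not the published HCI
# formula) of a hierarchy-consistency person-fit index in [-1, 1].
def hierarchy_consistency(responses, prerequisites):
    """responses: dict item -> 0/1 score for one examinee.
    prerequisites: dict item -> items expected correct whenever that item is correct."""
    comparisons = violations = 0
    for item, score in responses.items():
        if score == 1:
            for pre in prerequisites.get(item, []):
                comparisons += 1
                if responses[pre] == 0:
                    violations += 1
    if comparisons == 0:
        return 1.0
    return 1.0 - 2.0 * violations / comparisons

# Hypothetical 4-item test with a linear attribute hierarchy i1 -> i2 -> i3 -> i4.
prereq = {"i2": ["i1"], "i3": ["i1", "i2"], "i4": ["i1", "i2", "i3"]}
print(hierarchy_consistency({"i1": 0, "i2": 1, "i3": 1, "i4": 0}, prereq))  # about -0.33
```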
van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007
A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…
Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time
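The kind of response-time model referred to here treats the log of a response time as an item's time intensity minus the examinee's speed, plus noise. The sketch below uses crude moment estimates under the usual identification constraint that speeds average to zero; it is a simplification of the probabilistic estimation described in the article, and the timing data are hypothetical.

```python
# Minimal sketch of a lognormal response-time decomposition:
# log T[j][i] ~ beta[i] (item time intensity) - tau[j] (person speed) + noise.
import math

def estimate_time_intensities(times):
    """times[j][i] = response time in seconds of person j on item i."""
    log_t = [[math.log(t) for t in row] for row in times]
    n_persons, n_items = len(log_t), len(log_t[0])
    # Item time intensities: mean log time per item, assuming speeds average to zero.
    beta = [sum(log_t[j][i] for j in range(n_persons)) / n_persons
            for i in range(n_items)]
    # Person speeds: how much faster than expected each examinee works.
    tau = [sum(beta[i] - log_t[j][i] for i in range(n_items)) / n_items
           for j in range(n_persons)]
    return beta, tau

times = [[45, 80, 150], [30, 55, 100], [60, 95, 170]]   # hypothetical seconds
beta, tau = estimate_time_intensities(times)
print("time intensities:", [round(b, 2) for b in beta])
print("person speeds:   ", [round(t, 2) for t in tau])
```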
Hills, John R. – 1984
The literature on item bias, i.e., the question of whether some items in tests favor one cultural group over another due to irrelevant factors, is reviewed and evaluated. All known references through 1981 are described, including a large number of unpublished reports. Each method is described and the criticisms that have appeared in…
Descriptors: Evaluation Methods, Item Analysis, Racial Differences, Test Bias
Johanson, George A.; And Others – Evaluation Review, 1993
The article discusses the tendency of some respondents to omit items more often when the evaluation they would give is less positive and less often when it is more positive. Five examples illustrate this form of nonresponse bias. Recommendations for overcoming nonresponse bias are offered. (SLD)
Descriptors: Estimation (Mathematics), Evaluation Methods, Questionnaires, Response Style (Tests)
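A tiny simulation makes the direction of this bias concrete. The numbers below are hypothetical: respondents with lower ratings are given a higher chance of skipping the item, so the mean of the observed responses overstates the true mean.

```python
# Hypothetical illustration of nonresponse bias: omission is more likely when
# the rating the respondent would give is less positive.
import random

random.seed(1)
true_ratings = [random.randint(1, 5) for _ in range(10_000)]     # 1-5 scale
observed = [r for r in true_ratings
            if random.random() > (5 - r) * 0.15]                 # low ratings skipped more often
print("true mean:    ", round(sum(true_ratings) / len(true_ratings), 2))
print("observed mean:", round(sum(observed) / len(observed), 2))
```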
Marks, Anthony M.; Cronje, Johannes C. – Educational Technology & Society, 2008
Computer-based assessments are becoming more commonplace, perhaps as a necessity for faculty to cope with large class sizes. These tests often occur in large computer testing venues in which test security may be compromised. In an attempt to limit the likelihood of cheating in such venues, randomised presentation of items is automatically…
Descriptors: Educational Assessment, Educational Testing, Research Needs, Test Items
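One straightforward way to implement such randomisation, sketched below under assumptions the abstract does not spell out, is to seed the shuffle with the examinee identifier so that each candidate receives a reproducible but different ordering of the same item set.

```python
# Sketch of per-examinee randomised item ordering (illustrative, not the
# system described in the study).
import random

def randomised_order(item_ids, examinee_id):
    rng = random.Random(examinee_id)   # reproducible ordering per examinee
    order = list(item_ids)
    rng.shuffle(order)
    return order

items = ["Q1", "Q2", "Q3", "Q4", "Q5"]
print(randomised_order(items, examinee_id="student-042"))
print(randomised_order(items, examinee_id="student-117"))
```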
Askegaard, Lewis D.; Umila, Benwardo V. – Journal of Educational Measurement, 1982
Multiple matrix sampling of items and examinees was applied to an 18-item rank order instrument administered to a randomly assigned group and compared to the ordering and ranking of all items by control subjects. High correlations between ranks suggest the methodology may viably reduce respondent effort on long rank ordering tasks. (Author/CM)
Descriptors: Evaluation Methods, Item Sampling, Junior High Schools, Student Reaction
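The matrix-sampling idea can be sketched with a small simulation: each simulated respondent ranks only a random subset of the 18 items, and a full ordering is recovered by averaging the normalised ranks each item received. The data-generating rule and the mean-rank aggregation are illustrative assumptions, not the study's procedure.

```python
# Illustrative simulation of matrix sampling for a rank-order instrument.
import random
from collections import defaultdict

random.seed(0)
items = [f"item{i}" for i in range(1, 19)]    # an 18-item instrument
subset_size = 6

def simulate_respondent(true_order):
    """Rank a random subset of items consistently with a 'true' ordering."""
    chosen = random.sample(items, subset_size)
    ranked = sorted(chosen, key=true_order.index)
    return {item: (rank + 1) / len(ranked) for rank, item in enumerate(ranked)}

true_order = items[:]                         # pretend item1..item18 is the true ranking
rank_sums, counts = defaultdict(float), defaultdict(int)
for _ in range(300):                          # 300 respondents, 6 items each
    for item, norm_rank in simulate_respondent(true_order).items():
        rank_sums[item] += norm_rank
        counts[item] += 1

recovered = sorted(items, key=lambda it: rank_sums[it] / counts[it])
print(recovered[:5])                          # expected to begin item1, item2, ...
```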
Merz, William R. – 1980
Several methods of assessing test item bias are described, and the concept of fair use of tests is examined. A test item is biased if individuals of equal ability have different probabilities of answering the item correctly. The following seven general procedures used to examine test items for bias are summarized and discussed: (1) analysis of…
Descriptors: Comparative Analysis, Evaluation Methods, Factor Analysis, Mathematical Models
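The definition quoted in this abstract suggests a simple conditional check, sketched below with hypothetical data: match examinees on an ability proxy (here a rest score, i.e., the total score excluding the studied item) and compare the item's proportion correct across groups within each stratum. A consistent gap across strata is the signature of a biased item; this illustrates the definition rather than any one of the seven procedures the paper reviews.

```python
# Conditional proportion-correct comparison for one studied item, stratified on
# an ability proxy (illustrative data).
from collections import defaultdict

def conditional_p_values(records):
    """records: iterable of (group, rest_score, item_correct)."""
    tally = defaultdict(lambda: defaultdict(lambda: [0, 0]))   # stratum -> group -> [correct, n]
    for group, rest, correct in records:
        cell = tally[rest][group]
        cell[0] += correct
        cell[1] += 1
    return {stratum: {g: round(c / n, 2) for g, (c, n) in groups.items()}
            for stratum, groups in sorted(tally.items())}

data = [("ref", 4, 1), ("ref", 4, 1), ("ref", 4, 0), ("focal", 4, 1), ("focal", 4, 0),
        ("ref", 7, 1), ("ref", 7, 1), ("focal", 7, 1), ("focal", 7, 0), ("focal", 7, 0)]
print(conditional_p_values(data))
```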
