Publication Date
| Period | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 8 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Evaluation Methods | 11 |
| Item Analysis | 11 |
| Probability | 11 |
| Item Response Theory | 5 |
| Models | 4 |
| Correlation | 3 |
| Error of Measurement | 3 |
| Psychometrics | 3 |
| Sample Size | 3 |
| Statistical Analysis | 3 |
| Test Items | 3 |
Source
| Source | Count |
| --- | --- |
| Journal of Educational and… | 2 |
| Applied Measurement in… | 1 |
| Applied Psychological… | 1 |
| Educational and Psychological… | 1 |
| International Journal of… | 1 |
| International Society for… | 1 |
| Physical Review Physics… | 1 |
| Psychometrika | 1 |
Author
| Author | Count |
| --- | --- |
| Bartolucci, F. | 1 |
| Bendjilali, Nasrine | 1 |
| Beretvas, S. Natasha | 1 |
| Dirkzwager, Arie | 1 |
| Hoffman, Richard J. | 1 |
| Huang, Hung-Yu | 1 |
| Hung, Su-Pin | 1 |
| Husek, T. R. | 1 |
| Kolarec, Biserka | 1 |
| Kreiner, Svend | 1 |
| Lee, HwaYoung | 1 |
Publication Type
| Publication Type | Count |
| --- | --- |
| Journal Articles | 8 |
| Reports - Research | 8 |
| Reports - Descriptive | 1 |
| Speeches/Meeting Papers | 1 |
Education Level
| Education Level | Count |
| --- | --- |
| Adult Education | 1 |
| Higher Education | 1 |
| Postsecondary Education | 1 |
Kolarec, Biserka; Nincevic, Marina – International Society for Technology, Education, and Science, 2022
The object of this research is a statistics exam that contains problem tasks. One examiner applied two evaluation methods to evaluate the exam repeatedly; the goal was to compare the methods for objectivity. We call one of the two methods the serial evaluation method, which assumes evaluation of all exam…
Descriptors: Statistics Education, Mathematics Tests, Evaluation Methods, Test Construction
Hung, Su-Pin; Huang, Hung-Yu – Journal of Educational and Behavioral Statistics, 2022
To address response style or bias in rating scales, forced-choice items are often used to request that respondents rank their attitudes or preferences among a limited set of options. The rating scales used by raters to render judgments on ratees' performance also contribute to rater bias or errors; consequently, forced-choice items have recently…
Descriptors: Evaluation Methods, Rating Scales, Item Analysis, Preferences
Smith, Trevor I.; Bendjilali, Nasrine – Physical Review Physics Education Research, 2022
Several recent studies have employed item response theory (IRT) to rank incorrect responses to commonly used research-based multiple-choice assessments. These studies use Bock's nominal response model (NRM) for applying IRT to categorical (nondichotomous) data, but the response rankings only utilize half of the parameters estimated by the model.…
Descriptors: Item Response Theory, Test Items, Multiple Choice Tests, Science Tests
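Bock's nominal response model (NRM) cited in this abstract assigns each response category k a probability proportional to exp(a_k·θ + c_k). A minimal sketch, using made-up parameter values rather than any estimated in the paper:

```python
import numpy as np

def nrm_probs(theta, a, c):
    """Bock's nominal response model: category probabilities for one item.

    theta : latent trait value
    a, c  : slope and intercept parameters, one per response category
            (illustrative values only; real parameters come from fitting)
    """
    z = a * theta + c
    ez = np.exp(z - z.max())      # subtract the max for numerical stability
    return ez / ez.sum()

# Hypothetical 4-category item; category 0 is the keyed (correct) response
a = np.array([1.2, 0.1, -0.5, -0.8])
c = np.array([0.5, 0.2, -0.1, -0.6])
p = nrm_probs(theta=1.0, a=a, c=c)
```

Ranking the incorrect categories by their slope parameters a_k is the half of the model the abstract says prior studies used; the intercepts c_k are the other half.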
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
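The Mantel-Haenszel screen named in this abstract pools 2x2 tables (group x correct/incorrect) over matching-score strata into a common odds ratio. A sketch with made-up counts; the ETS delta rescaling shown is one conventional reporting choice, not necessarily the one used in the paper:

```python
import math

def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for DIF screening.

    strata: list of 2x2 tables (ref_correct, ref_wrong, foc_correct,
    foc_wrong), one per matching-score level. Counts are illustrative.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

strata = [(40, 10, 30, 20), (60, 5, 50, 15), (80, 2, 70, 10)]
alpha = mh_odds_ratio(strata)          # > 1: item favors the reference group
delta = -2.35 * math.log(alpha)        # ETS delta scale; |delta| >= 1.5 is a common flag
```

As the abstract notes, this detects DIF only across the *observed* grouping variable; latent-class DIF needs the mixture-model machinery the paper studies.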
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially large, unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
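The design effect at issue can be illustrated with Kish's approximation for cluster samples, DEFF = 1 + (m − 1)ρ, where m is the cluster size and ρ the intraclass correlation. The numbers below are hypothetical, chosen only to show how quickly the effective sample size shrinks:

```python
def design_effect(cluster_size, icc):
    """Kish design effect for cluster sampling: DEFF = 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * icc

# Hypothetical state assessment: 25 students per sampled classroom,
# intraclass correlation 0.10
deff = design_effect(cluster_size=25, icc=0.10)   # 3.4
n_eff = 2500 / deff   # a nominal n of 2,500 behaves like ~735 independent cases
```

Treating the 2,500 clustered observations as a simple random sample would understate standard errors by a factor of roughly sqrt(DEFF), which is the underestimation the article warns about.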
Bartolucci, F.; Montanari, G. E.; Pandolfi, S. – Psychometrika, 2012
With reference to a questionnaire aimed at assessing the performance of Italian nursing homes on the basis of the health conditions of their patients, we investigate two relevant issues: dimensionality of the latent structure and discriminating power of the items composing the questionnaire. The approach is based on a multidimensional item…
Descriptors: Foreign Countries, Probability, Item Analysis, Test Items
Kreiner, Svend – Applied Psychological Measurement, 2011
To rule out the need for a two-parameter item response theory (IRT) model during item analysis by Rasch models, it is important to check the Rasch model's assumption that all items have the same item discrimination. Biserial and polyserial correlation coefficients measuring the association between items and restscores are often used in an informal…
Descriptors: Item Analysis, Correlation, Item Response Theory, Models
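The informal check this abstract describes, comparing item-restscore correlations across items, can be sketched on simulated Rasch data. This illustrates the general idea only, not Kreiner's proposed formal test:

```python
import numpy as np

def item_restscore_pbis(responses):
    """Point-biserial correlation between each item and its restscore
    (total score excluding that item). Roughly equal values across items
    are consistent with the Rasch assumption of equal discrimination.

    responses: persons x items 0/1 matrix.
    """
    total = responses.sum(axis=1)
    out = []
    for j in range(responses.shape[1]):
        rest = total - responses[:, j]
        out.append(np.corrcoef(responses[:, j], rest)[0, 1])
    return np.array(out)

# Simulate Rasch-consistent data: P(correct) = logistic(theta - b)
rng = np.random.default_rng(0)
theta = rng.normal(size=500)                      # person abilities
b = np.array([-1.0, 0.0, 1.0])                    # item difficulties
prob = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
data = (rng.random((500, 3)) < prob).astype(int)
r = item_restscore_pbis(data)
```

Kreiner's point is that eyeballing such coefficients is informal; the paper supplies the distributional machinery needed to test equality properly.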
van der Linden, Wim J.; Veldkamp, Bernard P. – Journal of Educational and Behavioral Statistics, 2007
Two conditional versions of the exposure-control method with item-ineligibility constraints for adaptive testing in van der Linden and Veldkamp (2004) are presented. The first version is for unconstrained item selection, the second for item selection with content constraints imposed by the shadow-test approach. In both versions, the exposure rates…
Descriptors: Law Schools, Adaptive Testing, Item Analysis, Probability
Hoffman, Richard J. – 1972
In this paper, a new item analysis index, e, is derived as a function of difficulty and discrimination to represent item efficiency. Item discrimination in this paper is not independent of item difficulty, and it is demonstrated algebraically that the maximum discriminating power of an item may be determined from its difficulty. Item efficiency is…
Descriptors: Evaluation Methods, Item Analysis, Mathematical Concepts, Probability
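The abstract's claim that an item's maximum discriminating power is determined by its difficulty can be illustrated with the classical discrimination index D = p_upper − p_lower. Note this standard bound is not Hoffman's e index itself, which the truncated abstract does not fully specify:

```python
def max_discrimination(p):
    """Upper bound on the discrimination index D = p_upper - p_lower for an
    item with difficulty (proportion correct) p, assuming equal-sized upper
    and lower groups: D is largest when all correct responses fall in the
    upper group (or all incorrect ones in the lower group).
    """
    return 2 * p if p <= 0.5 else 2 * (1 - p)

# D can reach 1.0 only at p = 0.5; very easy or very hard items
# cannot discriminate strongly no matter how good they are.
d_mid = max_discrimination(0.5)    # 1.0
d_hard = max_discrimination(0.25)  # 0.5
```

Any efficiency index defined relative to this ceiling is therefore, as the abstract says, a function of difficulty and discrimination jointly.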
Husek, T. R.; Sirotnik, Ken – 1967
A description is given of item sampling, or "psychometric-statistical inference," an approach to gathering and using educational data that allows statistical inferences to be made simultaneously with psychometric inferences. This is a procedure in which both people and items are sampled, and the data from a sample of people taking a sample of items…
Descriptors: Data Analysis, Data Collection, Educational Research, Evaluation Methods
Dirkzwager, Arie – International Journal of Testing, 2003
The crux in psychometrics is how to estimate the probability that a respondent answers an item correctly on one occasion out of many. Under the current testing paradigm this probability is estimated using all kinds of statistical techniques and mathematical modeling. Multiple evaluation is a new testing paradigm using the person's own personal…
Descriptors: Psychometrics, Probability, Models, Measurement

