Publication Date
| Period | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 8 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Evaluation Methods | 11 |
| Item Analysis | 11 |
| Probability | 11 |
| Item Response Theory | 5 |
| Models | 4 |
| Correlation | 3 |
| Error of Measurement | 3 |
| Psychometrics | 3 |
| Sample Size | 3 |
| Statistical Analysis | 3 |
| Test Items | 3 |
Source
| Source | Count |
| --- | --- |
| Journal of Educational and… | 2 |
| Applied Measurement in… | 1 |
| Applied Psychological… | 1 |
| Educational and Psychological… | 1 |
| International Journal of… | 1 |
| International Society for… | 1 |
| Physical Review Physics… | 1 |
| Psychometrika | 1 |
Author
| Author | Count |
| --- | --- |
| Bartolucci, F. | 1 |
| Bendjilali, Nasrine | 1 |
| Beretvas, S. Natasha | 1 |
| Dirkzwager, Arie | 1 |
| Hoffman, Richard J. | 1 |
| Huang, Hung-Yu | 1 |
| Hung, Su-Pin | 1 |
| Husek, T. R. | 1 |
| Kolarec, Biserka | 1 |
| Kreiner, Svend | 1 |
| Lee, HwaYoung | 1 |
Publication Type
| Publication Type | Count |
| --- | --- |
| Journal Articles | 8 |
| Reports - Research | 8 |
| Reports - Descriptive | 1 |
| Speeches/Meeting Papers | 1 |
Education Level
| Education Level | Count |
| --- | --- |
| Adult Education | 1 |
| Higher Education | 1 |
| Postsecondary Education | 1 |
Kolarec, Biserka; Nincevic, Marina – International Society for Technology, Education, and Science, 2022
The object of this research is a statistics exam that contains problem tasks. One examiner applied two evaluation methods to evaluate the exam repeatedly; the goal was to compare the methods for objectivity. We call one of the two methods the serial evaluation method, which assumes evaluation of all exam…
Descriptors: Statistics Education, Mathematics Tests, Evaluation Methods, Test Construction
Hung, Su-Pin; Huang, Hung-Yu – Journal of Educational and Behavioral Statistics, 2022
To address response style or bias in rating scales, forced-choice items are often used to request that respondents rank their attitudes or preferences among a limited set of options. The rating scales used by raters to render judgments on ratees' performance also contribute to rater bias or errors; consequently, forced-choice items have recently…
Descriptors: Evaluation Methods, Rating Scales, Item Analysis, Preferences
Smith, Trevor I.; Bendjilali, Nasrine – Physical Review Physics Education Research, 2022
Several recent studies have employed item response theory (IRT) to rank incorrect responses to commonly used research-based multiple-choice assessments. These studies use Bock's nominal response model (NRM) for applying IRT to categorical (nondichotomous) data, but the response rankings only utilize half of the parameters estimated by the model.…
Descriptors: Item Response Theory, Test Items, Multiple Choice Tests, Science Tests
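Bock's nominal response model (NRM) cited in this abstract assigns each response category k a probability proportional to exp(a_k·θ + c_k). A minimal sketch, using made-up parameter values rather than any estimated in the paper:

```python
import numpy as np

def nrm_probs(theta, a, c):
    """Bock's nominal response model: category probabilities for one item.

    theta : latent trait value
    a, c  : slope and intercept parameters, one per response category
            (illustrative values only; real parameters come from fitting)
    """
    z = a * theta + c
    ez = np.exp(z - z.max())      # subtract the max for numerical stability
    return ez / ez.sum()

# Hypothetical 4-category item; category 0 is the keyed (correct) response
a = np.array([1.2, 0.1, -0.5, -0.8])
c = np.array([0.5, 0.2, -0.1, -0.6])
p = nrm_probs(theta=1.0, a=a, c=c)
```

Ranking the incorrect categories by their slope parameters a_k is the half of the model the abstract says prior studies used; the intercepts c_k are the other half.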
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
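The Mantel-Haenszel screen named in this abstract pools 2x2 tables (group x correct/incorrect) over matching-score strata into a common odds ratio. A sketch with made-up counts; the ETS delta rescaling shown is one conventional reporting choice, not necessarily the one used in the paper:

```python
import math

def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for DIF screening.

    strata: list of 2x2 tables (ref_correct, ref_wrong, foc_correct,
    foc_wrong), one per matching-score level. Counts are illustrative.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

strata = [(40, 10, 30, 20), (60, 5, 50, 15), (80, 2, 70, 10)]
alpha = mh_odds_ratio(strata)          # > 1: item favors the reference group
delta = -2.35 * math.log(alpha)        # ETS delta scale; |delta| >= 1.5 is a common flag
```

As the abstract notes, this detects DIF only across the *observed* grouping variable; latent-class DIF needs the mixture-model machinery the paper studies.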
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially large, unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
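The design effect at issue can be illustrated with Kish's approximation for cluster samples, DEFF = 1 + (m − 1)ρ, where m is the cluster size and ρ the intraclass correlation. The numbers below are hypothetical, chosen only to show how quickly the effective sample size shrinks:

```python
def design_effect(cluster_size, icc):
    """Kish design effect for cluster sampling: DEFF = 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * icc

# Hypothetical state assessment: 25 students per sampled classroom,
# intraclass correlation 0.10
deff = design_effect(cluster_size=25, icc=0.10)   # 3.4
n_eff = 2500 / deff   # a nominal n of 2,500 behaves like ~735 independent cases
```

Treating the 2,500 clustered observations as a simple random sample would understate standard errors by a factor of roughly sqrt(DEFF), which is the underestimation the article warns about.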
Bartolucci, F.; Montanari, G. E.; Pandolfi, S. – Psychometrika, 2012
With reference to a questionnaire aimed at assessing the performance of Italian nursing homes on the basis of the health conditions of their patients, we investigate two relevant issues: dimensionality of the latent structure and discriminating power of the items composing the questionnaire. The approach is based on a multidimensional item…
Descriptors: Foreign Countries, Probability, Item Analysis, Test Items
Kreiner, Svend – Applied Psychological Measurement, 2011
To rule out the need for a two-parameter item response theory (IRT) model during item analysis by Rasch models, it is important to check the Rasch model's assumption that all items have the same item discrimination. Biserial and polyserial correlation coefficients measuring the association between items and restscores are often used in an informal…
Descriptors: Item Analysis, Correlation, Item Response Theory, Models
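The informal check this abstract describes, comparing item-restscore correlations across items, can be sketched on simulated Rasch data. This illustrates the general idea only, not Kreiner's proposed formal test:

```python
import numpy as np

def item_restscore_pbis(responses):
    """Point-biserial correlation between each item and its restscore
    (total score excluding that item). Roughly equal values across items
    are consistent with the Rasch assumption of equal discrimination.

    responses: persons x items 0/1 matrix.
    """
    total = responses.sum(axis=1)
    out = []
    for j in range(responses.shape[1]):
        rest = total - responses[:, j]
        out.append(np.corrcoef(responses[:, j], rest)[0, 1])
    return np.array(out)

# Simulate Rasch-consistent data: P(correct) = logistic(theta - b)
rng = np.random.default_rng(0)
theta = rng.normal(size=500)                      # person abilities
b = np.array([-1.0, 0.0, 1.0])                    # item difficulties
prob = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
data = (rng.random((500, 3)) < prob).astype(int)
r = item_restscore_pbis(data)
```

Kreiner's point is that eyeballing such coefficients is informal; the paper supplies the distributional machinery needed to test equality properly.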
van der Linden, Wim J.; Veldkamp, Bernard P. – Journal of Educational and Behavioral Statistics, 2007
Two conditional versions of the exposure-control method with item-ineligibility constraints for adaptive testing in van der Linden and Veldkamp (2004) are presented. The first version is for unconstrained item selection, the second for item selection with content constraints imposed by the shadow-test approach. In both versions, the exposure rates…
Descriptors: Law Schools, Adaptive Testing, Item Analysis, Probability
Hoffman, Richard J. – 1972
In this paper, a new item analysis index, e, is derived as a function of difficulty and discrimination to represent item efficiency. Item discrimination in this paper is not independent of item difficulty, and it is demonstrated algebraically that the maximum discriminating power of an item may be determined from its difficulty. Item efficiency is…
Descriptors: Evaluation Methods, Item Analysis, Mathematical Concepts, Probability
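The abstract's claim that an item's maximum discriminating power is determined by its difficulty can be illustrated with the classical discrimination index D = p_upper − p_lower. Note this standard bound is not Hoffman's e index itself, which the truncated abstract does not fully specify:

```python
def max_discrimination(p):
    """Upper bound on the discrimination index D = p_upper - p_lower for an
    item with difficulty (proportion correct) p, assuming equal-sized upper
    and lower groups: D is largest when all correct responses fall in the
    upper group (or all incorrect ones in the lower group).
    """
    return 2 * p if p <= 0.5 else 2 * (1 - p)

# D can reach 1.0 only at p = 0.5; very easy or very hard items
# cannot discriminate strongly no matter how good they are.
d_mid = max_discrimination(0.5)    # 1.0
d_hard = max_discrimination(0.25)  # 0.5
```

Any efficiency index defined relative to this ceiling is therefore, as the abstract says, a function of difficulty and discrimination jointly.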
Husek, T. R.; Sirotnik, Ken – 1967
A description is given of item sampling, or "psychometric-statistical inference," an approach to gathering and using educational data that allows statistical inferences to be made simultaneously with psychometric inferences. This is a procedure in which both people and items are sampled, and the data from a sample of people taking a sample of items…
Descriptors: Data Analysis, Data Collection, Educational Research, Evaluation Methods
Dirkzwager, Arie – International Journal of Testing, 2003
The crux in psychometrics is how to estimate the probability that a respondent answers an item correctly on one occasion out of many. Under the current testing paradigm this probability is estimated using all kinds of statistical techniques and mathematical modeling. Multiple evaluation is a new testing paradigm using the person's own personal…
Descriptors: Psychometrics, Probability, Models, Measurement

