Publication Date
| In 2026 | 0 |
| Since 2025 | 2 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 4 |
| Since 2007 (last 20 years) | 8 |
Descriptor
| Data | 8 |
| Artificial Intelligence | 2 |
| Item Response Theory | 2 |
| Judges | 2 |
| Models | 2 |
| Standard Setting (Scoring) | 2 |
| Tests | 2 |
| Accuracy | 1 |
| Automation | 1 |
| Bayesian Statistics | 1 |
| Cloze Procedure | 1 |
Source
| Educational Measurement:… | 8 |
Author
| Clauser, Brian E. | 2 |
| Haberman, Shelby J. | 2 |
| Margolis, Melissa J. | 2 |
| Sinharay, Sandip | 2 |
| Ackerman, Terry A. | 1 |
| Cui, Zhongmin | 1 |
| Guher Gorgun | 1 |
| Luecht, Richard | 1 |
| Mee, Janet | 1 |
| Okan Bulut | 1 |
| Puhan, Gautam | 1 |
Publication Type
| Journal Articles | 8 |
| Reports - Research | 4 |
| Reports - Descriptive | 3 |
| Opinion Papers | 1 |
| Reports - Evaluative | 1 |
Silvia Testa; Renato Miceli – Educational Measurement: Issues and Practice, 2025
Random Equating (RE) and Heuristic Approach (HA) are two linking procedures that may be used to compare the scores of individuals in two tests that measure the same latent trait, in conditions where there are no common items or individuals. In this study, RE--that may only be used when the individuals taking the two tests come from the same…
Descriptors: Comparative Testing, Heuristics, Problem Solving, Personality Traits
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Cui, Zhongmin – Educational Measurement: Issues and Practice, 2021
Commonly used machine learning applications seem to relate to big data. This article provides a gentle review of machine learning and shows why machine learning can be applied to small data too. An example of applying machine learning to screen irregularity reports is presented. In the example, the support vector machine and multinomial naïve…
Descriptors: Artificial Intelligence, Man Machine Systems, Data, Bayesian Statistics
Luecht, Richard; Ackerman, Terry A. – Educational Measurement: Issues and Practice, 2018
Simulation studies are extremely common in the item response theory (IRT) research literature. This article presents a didactic discussion of "truth" and "error" in IRT-based simulation studies. We ultimately recommend that future research focus less on the simple recovery of parameters from a convenient generating IRT model,…
Descriptors: Item Response Theory, Simulation, Ethics, Error of Measurement
Sinharay, Sandip; Haberman, Shelby J. – Educational Measurement: Issues and Practice, 2014
Standard 3.9 of the Standards for Educational and Psychological Testing (1999) demands evidence of model fit when item response theory (IRT) models are applied to data from tests. Hambleton and Han (2005) and Sinharay (2005) recommended the assessment of practical significance of misfit of IRT models, but…
Descriptors: Item Response Theory, Goodness of Fit, Models, Tests
Margolis, Melissa J.; Clauser, Brian E. – Educational Measurement: Issues and Practice, 2014
This research evaluated the impact of a common modification to Angoff standard-setting exercises: the provision of examinee performance data. Data from 18 independent standard-setting panels across three different medical licensing examinations were examined to investigate whether and how the provision of performance information impacted judgments…
Descriptors: Cutting Scores, Standard Setting (Scoring), Data, Licensing Examinations (Professions)
Mee, Janet; Clauser, Brian E.; Margolis, Melissa J. – Educational Measurement: Issues and Practice, 2013
Despite being widely used and frequently studied, the Angoff standard setting procedure has received little attention with respect to an integral part of the process: how judges incorporate examinee performance data in the decision-making process. Without performance data, subject matter experts have considerable difficulty accurately making the…
Descriptors: Standard Setting (Scoring), Judges, Data, Decision Making
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Educational Measurement: Issues and Practice, 2011
The purpose of this ITEMS module is to provide an introduction to subscores. First, examples of subscores from an operational test are provided. Then, a review of methods that can be used to examine if subscores have adequate psychometric quality is provided. It is demonstrated, using results from operational and simulated data, that subscores…
Descriptors: Scores, Psychometrics, Tests, Data

