ERIC - Search Results

Showing 1 to 15 of 31 results

Exploration of Latent Structure in Test Revision and Review Log Data

Peer reviewed

Zhang, Susu; Li, Anqi; Wang, Shiyu – Educational Measurement: Issues and Practice, 2023

In computer-based tests allowing revision and reviews, examinees' sequence of visits and answer changes to questions can be recorded. The variable-length revision log data introduce new complexities to the collected data but, at the same time, provide additional information on examinees' test-taking behavior, which can inform test development and…

Descriptors: Computer Assisted Testing, Test Construction, Test Wiseness, Test Items

An Investigation of the Nature and Consequence of the Relationship between IRT Difficulty and Discrimination

Peer reviewed

Sweeney, Sandra M.; Sinharay, Sandip; Johnson, Matthew S.; Steinhauer, Eric W. – Educational Measurement: Issues and Practice, 2022

The focus of this paper is on the empirical relationship between item difficulty and item discrimination. Two studies--an empirical investigation and a simulation study--were conducted to examine the association between item difficulty and item discrimination under classical test theory and item response theory (IRT), and the effects of the…

Descriptors: Correlation, Item Response Theory, Item Analysis, Difficulty Level

An Automated Item Pool Assembly Framework for Maximizing Item Utilization for CAT

Peer reviewed

Hwanggyu Lim; Kyung T. Han – Educational Measurement: Issues and Practice, 2024

Computerized adaptive testing (CAT) has gained deserved popularity in the administration of educational and professional assessments, but continues to face test security challenges. To ensure sustained quality assurance and testing integrity, it is imperative to establish and maintain multiple stable item pools that are consistent in terms of…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Item Banks

A Machine Learning Approach for the Simultaneous Detection of Preknowledge in Examinees and Items When Both Are Unknown

Peer reviewed

Pan, Yiqin; Wollack, James A. – Educational Measurement: Issues and Practice, 2023

Pan and Wollack (PW) proposed a machine learning method to detect compromised items. We extend the work of PW to an approach detecting compromised items and examinees with item preknowledge simultaneously and draw on ideas in ensemble learning to relax several limitations in the work of PW. The suggested approach also provides a confidence score,…

Descriptors: Artificial Intelligence, Prior Learning, Item Analysis, Test Content

Combining Process Information and Item Response Modeling to Estimate Problem-Solving Ability

Peer reviewed

Xiao, Yue; Veldkamp, Bernard; Liu, Hongyun – Educational Measurement: Issues and Practice, 2022

The action sequences of respondents in problem-solving tasks reflect rich and detailed information about their performance, including differences in problem-solving ability, even if item scores are equal. It is therefore not sufficient to infer individual problem-solving skills based solely on item scores. This study is a preliminary attempt to…

Descriptors: Problem Solving, Item Response Theory, Scores, Item Analysis

Using OpenAI GPT to Generate Reading Comprehension Items

Peer reviewed

Ayfer Sayin; Mark Gierl – Educational Measurement: Issues and Practice, 2024

The purpose of this study is to introduce and evaluate a method for generating reading comprehension items using template-based automatic item generation. To begin, we describe a new model for generating reading comprehension items called the text analysis cognitive model assessing inferential skills across different reading passages. Next, the…

Descriptors: Algorithms, Reading Comprehension, Item Analysis, Man Machine Systems

Setting and Validating Multiple Standards on a Multistage-Adaptive Test

Peer reviewed

Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022

Setting cut scores on multistage-adaptive tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…

Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis

A Special Case of Brennan's Index for Tests That Aim to Select a Limited Number of Students: A Monte Carlo Simulation Study

Peer reviewed

Arikan, Serkan; Aybek, Eren Can – Educational Measurement: Issues and Practice, 2022

Many scholars compared various item discrimination indices in real or simulated data. Item discrimination indices, such as item-total correlation, item-rest correlation, and IRT item discrimination parameter, provide information about individual differences among all participants. However, there are tests that aim to select a very limited number…

Descriptors: Monte Carlo Methods, Item Analysis, Correlation, Individual Differences

Digital Module 08: Foundations of Operational Item Analysis https://ncme.elevate.commpartners.com

Peer reviewed

Yoo, Hanwook; Hambleton, Ronald K. – Educational Measurement: Issues and Practice, 2019

Item analysis is an integral part of operational test development and is typically conducted within two popular statistical frameworks: classical test theory (CTT) and item response theory (IRT). In this digital ITEMS module, Hanwook Yoo and Ronald K. Hambleton provide an accessible overview of operational item analysis approaches within these…

Descriptors: Item Analysis, Item Response Theory, Guidelines, Test Construction

Improving the Measurement of School Climate Using Item Response Theory

Peer reviewed

Sarah Lindstrom Johnson; Ray E. Reichenberg; Kathan Shukla; Tracy E. Waasdorp; Catherine P. Bradshaw – Educational Measurement: Issues and Practice, 2019

The U.S. government has become increasingly focused on school climate, as recently evidenced by its inclusion as an accountability indicator in the Every Student Succeeds Act. Yet, there remains considerable variability in both conceptualizing and measuring school climate. To better inform the research and practice related to school climate and…

Descriptors: Item Response Theory, Educational Environment, Accountability, Educational Legislation

Easier Said than Done: Rejoinder on Sijtsma and on Green and Yang

Peer reviewed

Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U. – Educational Measurement: Issues and Practice, 2016

The main points of Sijtsma and Green and Yang in Educational Measurement: Issues and Practice (34, 4) are that reliability, internal consistency, and unidimensionality are distinct and that Cronbach's alpha may be problematic. Neither of these assertions is at odds with Davenport, Davison, Liou, and Love in the same issue. However, many authors…

Descriptors: Educational Assessment, Reliability, Validity, Test Construction

Effect of Content Knowledge on Angoff-Style Standard Setting Judgments

Peer reviewed

Margolis, Melissa J.; Mee, Janet; Clauser, Brian E.; Winward, Marcia; Clauser, Jerome C. – Educational Measurement: Issues and Practice, 2016

Evidence to support the credibility of standard setting procedures is a critical part of the validity argument for decisions made based on tests that are used for classification. One area in which there has been limited empirical study is the impact of standard setting judge selection on the resulting cut score. One important issue related to…

Descriptors: Academic Standards, Standard Setting (Scoring), Cutting Scores, Credibility

Teaching Introductory Measurement: Suggestions for What to Include and How to Motivate Students

Peer reviewed

Bandalos, Deborah L.; Kopp, Jason P. – Educational Measurement: Issues and Practice, 2012

In this article, we discuss the importance of measurement literacy and some issues encountered in teaching introductory measurement courses. We present results from a survey of introductory measurement instructors, including information about the topics included in such courses and the amount of time spent on each. Topics that were included by the…

Descriptors: Class Activities, Motivation Techniques, Item Analysis, Test Theory

Instructional Sensitivity as a Psychometric Property of Assessments

Peer reviewed

Polikoff, Morgan S. – Educational Measurement: Issues and Practice, 2010

Standards-based reform, as codified by the No Child Left Behind Act, relies on the ability of assessments to accurately reflect the learning that takes place in U.S. classrooms. However, this property of assessments--their instructional sensitivity--is rarely, if ever, investigated by test developers, states, or researchers. In this paper, the…

Descriptors: Federal Legislation, Psychometrics, Accountability, Teaching Methods

Screening for Potentially Biased Items in Testing Programs

Peer reviewed

Hills, John R. – Educational Measurement: Issues and Practice, 1989

Test bias detection methods based on item response theory (IRT) are reviewed. Five such methods are commonly used: (1) equality of item parameters; (2) area between item characteristic curves; (3) sums of squares; (4) pseudo-IRT; and (5) one-parameter-IRT. A table compares these and six newer or less tested methods. (SLD)

Descriptors: Item Analysis, Test Bias, Test Items, Testing Programs
