Showing all 12 results
Peer reviewed
Silber, Henning; Roßmann, Joss; Gummer, Tobias – International Journal of Social Research Methodology, 2018
In this article, we present the results of three question design experiments on inter-item correlations, which tested a grid design against a single-item design. The first and second experiments examined the inter-item correlations of a set with five and seven items, respectively, and the third experiment examined the impact of the question design…
Descriptors: Foreign Countries, Online Surveys, Experiments, Correlation
Ayodele, Alicia Nicole – ProQuest LLC, 2017
Within polytomous items, differential item functioning (DIF) can take on various forms due to the number of response categories. The lack of invariance at this level is referred to as differential step functioning (DSF). The most common DSF methods in the literature are the adjacent category log odds ratio (AC-LOR) estimator and cumulative…
Descriptors: Statistical Analysis, Test Bias, Test Items, Scores
Peer reviewed
Zhang, Jinming; Li, Jie – Journal of Educational Measurement, 2016
An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed…
Descriptors: Computer Assisted Testing, Test Items, Difficulty Level, Item Response Theory
Peer reviewed
Hou, Likun; de la Torre, Jimmy; Nandakumar, Ratna – Journal of Educational Measurement, 2014
Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study…
Descriptors: Test Bias, Models, Simulation, Error Patterns
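In its simplest scalar form, a Wald test for DIF compares an item parameter estimated separately in two groups against its combined sampling variance. A minimal sketch (the estimates and variances below are hypothetical, not from the article, and this is not the authors' CDM implementation):

```python
def wald_statistic(est_ref, est_focal, var_ref, var_focal):
    """Scalar Wald statistic for the difference of one item parameter
    estimated in two independent groups; compared to a chi-square(1)
    critical value (3.84 at alpha = .05)."""
    diff = est_ref - est_focal
    return diff * diff / (var_ref + var_focal)

# Hypothetical item-parameter estimates and sampling variances
# for a reference and a focal group.
w = wald_statistic(0.60, 0.45, 0.002, 0.003)
print(round(w, 2))
```

In the CDM setting studied here, the test is the multivariate analogue, with a vector of item parameters and its covariance matrix.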
Peer reviewed
Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon; Penfield, Randall D. – Educational and Psychological Measurement, 2013
The Rasch model, a member of a larger group of models within item response theory, is widely used in empirical studies. Detection of uniform differential item functioning (DIF) within the Rasch model typically employs null hypothesis testing with a concomitant consideration of effect size (e.g., signed area [SA]). Parametric equivalence between…
Descriptors: Test Bias, Effect Size, Item Response Theory, Comparative Analysis
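Under the Rasch model, uniform DIF amounts to a constant shift in item difficulty for the focal group, and the signed-area effect size equals that shift. A minimal sketch of the item characteristic curve with an illustrative (hypothetical) difficulty shift:

```python
import math

def rasch_p(theta, b):
    """Rasch probability of a correct response at ability theta
    for an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Illustrative values only: a reference-group difficulty of 0.0
# and a uniform DIF shift of 0.5 logits for the focal group.
b_ref, dif_shift = 0.0, 0.5
for theta in (-1.0, 0.0, 1.0):
    p_ref = rasch_p(theta, b_ref)
    p_foc = rasch_p(theta, b_ref + dif_shift)
    print(f"theta={theta:+.1f}  ref={p_ref:.3f}  focal={p_foc:.3f}")
```

The focal-group curve lies uniformly below the reference curve, which is what makes the DIF "uniform" and the signed area well defined.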
Peer reviewed
Finkelman, Matthew D.; Weiss, David J.; Kim-Kang, Gyenam – Applied Psychological Measurement, 2010
Assessing individual change is an important topic in both psychological and educational measurement. An adaptive measurement of change (AMC) method had previously been shown to exhibit greater efficiency in detecting change than conventional nonadaptive methods. However, little work had been done to compare different procedures within the AMC…
Descriptors: Computer Assisted Testing, Hypothesis Testing, Measurement, Item Analysis
Holland, Paul W.; Thayer, Dorothy T. – 1986
The Mantel-Haenszel procedure (MH) is a practical, inexpensive, and powerful way to detect test items that function differently in two groups of examinees. MH is a natural outgrowth of previously suggested chi square methods, and it is also related to methods based on item response theory. The study of items that function differently for two…
Descriptors: Comparative Analysis, Hypothesis Testing, Item Analysis, Latent Trait Theory
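The core of the Mantel-Haenszel procedure is a common odds ratio pooled over matching-score strata, each stratum contributing a 2x2 table of correct/incorrect counts for the reference and focal groups. A minimal sketch with hypothetical counts (not data from the report):

```python
# Hypothetical 2x2 tables, one per matching-score stratum:
# (reference correct, reference incorrect, focal correct, focal incorrect)
strata = [
    (40, 10, 30, 20),
    (30, 20, 25, 25),
    (20, 30, 10, 40),
]

def mh_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio pooled across strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

print(round(mh_odds_ratio(strata), 3))
```

A ratio near 1 indicates no DIF; values far from 1 indicate the item favors one group after matching on total score. The full MH procedure adds a chi-square significance test with a continuity correction.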
Peer reviewed
Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006
A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…
Descriptors: Cheating, Test Items, Simulation, Statistical Analysis
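Cohen's kappa corrects the observed agreement between two response vectors for the agreement expected by chance from their marginal answer distributions. A minimal sketch with hypothetical answer strings (this is the kappa statistic itself, not the authors' full copying test):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two equal-length response vectors."""
    assert len(a) == len(b)
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    ca, cb = Counter(a), Counter(b)
    # expected agreement under independent marginal response distributions
    p_exp = sum(ca[k] * cb.get(k, 0) for k in ca) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical source and suspected-copier answer strings on a 10-item test.
source = "ABCDABCDAB"
copier = "ABCDABCDAC"   # agrees with the source on 9 of 10 items
print(round(cohens_kappa(source, copier), 3))
```

Kappa near 0 means agreement no better than chance; values near 1 flag suspiciously high agreement, which the proposed test evaluates against a null distribution.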
Peer reviewed
Hula, William; Doyle, Patrick J.; McNeil, Malcolm R.; Mikolic, Joseph M. – Journal of Speech, Language, and Hearing Research, 2006
The purpose of this research was to examine the validity of the 55-item Revised Token Test (RTT) and to compare traditional and Rasch-based scores in their ability to detect group differences and change over time. The 55-item RTT was administered to 108 left- and right-hemisphere stroke survivors, and the data were submitted to Rasch analysis.…
Descriptors: Test Items, Brain Hemisphere Functions, Individual Differences, Difficulty Level
Peer reviewed
Jones, Linda – Language Learning & Technology, 2004
This article describes two studies that examined the effects of pictorial and written annotations on second language (L2) vocabulary learning from a multimedia environment. In both studies, students were randomly assigned to one of four aural multimedia groups: a control group that received no annotations, and three treatment groups that provided…
Descriptors: Control Groups, Test Items, Testing, Vocabulary Development
Peer reviewed
Jelden, D. L. – Journal of Educational Technology Systems, 1988
Reviews study conducted to compare levels of achievement on final exams for college students responding to combinations of test-item feedback methods and modes of test-item presentation. The PHOENIX computer system used in the comparison is described, and the use of ACT (American College Testing Program) scores for ability comparison is discussed.…
Descriptors: Academic Ability, Academic Achievement, Achievement Tests, Analysis of Covariance
Park, Chung; Allen, Nancy L. – 1994
This study is part of continuing research into the meaning of future National Assessment of Educational Progress (NAEP) science scales. In this study, the test framework, as examined by NAEP's consensus process, and attributes of the items, identified by science experts, cognitive scientists, and measurement specialists, are examined. Preliminary…
Descriptors: Communication (Thought Transfer), Comparative Analysis, Construct Validity, Content Validity