Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation
Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025
This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…
Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis
Metsämuuronen, Jari – International Journal of Educational Methodology, 2021
Although Goodman-Kruskal gamma (G) is used relatively rarely, it has promising potential as a coefficient of association in educational settings. Characteristics of G are studied in three sub-studies related to educational measurement settings. G appears to be unexpectedly appealing as an estimator of association between an item and a score because…
Descriptors: Educational Assessment, Measurement, Item Analysis, Correlation
Guo, Wenjing; Choi, Youn-Jeng – Educational and Psychological Measurement, 2023
Determining the number of dimensions is extremely important in applying item response theory (IRT) models to data. Traditional and revised parallel analyses have been proposed within the factor analysis framework, and both have shown some promise in assessing dimensionality. However, their performance in the IRT framework has not been…
Descriptors: Item Response Theory, Evaluation Methods, Factor Analysis, Guidelines
Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023
A Monte Carlo simulation study was conducted to examine the performance of α, λ2, λ₄, λ₂, ω_T, GLB_MRFA, and GLB_Algebraic coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…
Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation
Krishna Mohan Surapaneni; Anusha Rajajagadeesan; Lakshmi Goudhaman; Shalini Lakshmanan; Saranya Sundaramoorthi; Dineshkumar Ravi; Kalaiselvi Rajendiran; Porchelvan Swaminathan – Biochemistry and Molecular Biology Education, 2024
The emergence of ChatGPT as one of the most advanced chatbots and its ability to generate diverse data has given room for numerous discussions worldwide regarding its utility, particularly in advancing medical education and research. This study seeks to assess the performance of ChatGPT in medical biochemistry to evaluate its potential as an…
Descriptors: Biochemistry, Science Instruction, Artificial Intelligence, Teaching Methods
Kilic, Abdullah Faruk; Uysal, Ibrahim – International Journal of Assessment Tools in Education, 2022
Most researchers investigate the corrected item-total correlation of items when analyzing item discrimination in multi-dimensional structures under the Classical Test Theory, which might lead to underestimating item discrimination, thereby removing items from the test. Researchers might investigate the corrected item-total correlation with the…
Descriptors: Item Analysis, Correlation, Item Response Theory, Test Items
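Note: the corrected item-total correlation named in the abstract above is, in its usual Classical Test Theory form, the Pearson correlation between an item and the total score computed without that item. The following is a minimal sketch of that standard statistic only, not of the alternative the authors go on to examine; the function names and the small response matrix are hypothetical and purely illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes both sequences have non-zero variance; otherwise the
    denominator is zero and a ZeroDivisionError is raised.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses, item):
    """Correlate one item with the total score of the remaining items.

    responses: list of rows, one row of item scores per examinee.
    item: zero-based column index of the item under study.
    """
    item_scores = [row[item] for row in responses]
    # "Corrected" means the item itself is removed from the total.
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


# Hypothetical data: 5 examinees by 4 dichotomous items.
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

r_item0 = corrected_item_total(responses, 0)
```

Removing the item from the total avoids the inflation that comes from correlating an item with a score that already contains it. The sketch treats the test as a single scale, which is the situation the abstract flags as potentially misleading for multidimensional structures.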
Jessica Röhner; Philipp Thoss; Liad Uziel – Educational and Psychological Measurement, 2024
According to faking models, personality variables and faking are related. Most prominently, people's tendency to try to make an appropriate impression (impression management; IM) and their tendency to adjust the impression they make (self-monitoring; SM) have been suggested to be associated with faking. Nevertheless, empirical findings connecting…
Descriptors: Metacognition, Deception, Personality Traits, Scores
Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
An Analysis of Differential Bundle Functioning in Multidimensional Tests Using the SIBTEST Procedure
Özdogan, Didem; Kelecioglu, Hülya – International Journal of Assessment Tools in Education, 2022
This study aims to analyze the differential bundle functioning in multidimensional tests with a specific purpose to detect this effect through differentiating the location of the item with DIF in the test, the correlation between the dimensions, the sample size, and the ratio of reference to focal group size. The first 10 items of the test that is…
Descriptors: Correlation, Sample Size, Test Items, Item Analysis
Ahmet Yildirim; Nizamettin Koç – International Journal of Assessment Tools in Education, 2024
The present research aims to examine whether the questions in the Programme for International Student Assessment (PISA) 2009 reading literacy instrument display differential item functioning (DIF) among the Turkish, French, and American samples based on univariate and multivariate matching techniques before and after the total score, which is…
Descriptors: Test Items, Item Analysis, Correlation, Error of Measurement
Domingue, Benjamin W.; Kanopka, Klint; Stenhaug, Ben; Sulik, Michael J.; Beverly, Tanesia; Brinkhuis, Matthieu; Circi, Ruhan; Faul, Jessica; Liao, Dandan; McCandliss, Bruce; Obradovic, Jelena; Piech, Chris; Porter, Tenelle; Soland, James; Weeks, Jon; Wise, Steven L.; Yeatman, Jason – Journal of Educational and Behavioral Statistics, 2022
The speed-accuracy trade-off (SAT) suggests that time constraints reduce response accuracy. Its relevance in observational settings--where response time (RT) may not be constrained but respondent speed may still vary--is unclear. Using 29 data sets containing data from cognitive tasks, we use a flexible method for identification of the SAT (which…
Descriptors: Accuracy, Reaction Time, Task Analysis, College Entrance Examinations
Alallo, Hajir Mahmood Ibrahim; Mohammed, Aisha; Hamid, Zayad Khalaf; Hassan, Aalaa Yaseen; Kadhim, Qasim Khlaif – International Journal of Language Testing, 2023
Diagnostic classification models (DCMs) have recently become very popular both for research purposes and for real testing endeavors for student assessment. A plethora of DCM models give researchers and practitioners a wide range of options for student diagnosis and classification. One intriguing option that some DCM models offer is the possibility…
Descriptors: Language Tests, Diagnostic Tests, Classification, Clinical Diagnosis
Sedat Sen; Allan S. Cohen – Educational and Psychological Measurement, 2024
A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…
Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification
Marli Crabtree; Kenneth L. Thompson; Ellen M. Robertson – HAPS Educator, 2024
Research has suggested that changing one's answer on multiple-choice examinations is more likely to lead to positive academic outcomes. This study aimed to further understand the relationship between changing answer selections and item attributes, student performance, and time within a population of 158 first-year medical students enrolled in a…
Descriptors: Anatomy, Science Tests, Medical Students, Medical Education