ERIC - Search Results

Publication Date

In 2025	0
Since 2024	2
Since 2021 (last 5 years)	9
Since 2016 (last 10 years)	11
Since 2006 (last 20 years)	17

Descriptor

Item Analysis	17
Item Response Theory	15
Test Items	10
Error of Measurement	6
Models	5
Sample Size	5
Comparative Analysis	4
Difficulty Level	4
Accuracy	3
Guessing (Tests)	3
Mathematics Tests	3
Scores	3
Standardized Tests	3
Test Construction	3
Test Format	3
Test Length	3
Test Reliability	3
Test Validity	3
Testing Programs	3
Computer Assisted Testing	2
Equated Scores	2
Error Patterns	2
Evaluation Methods	2
Evaluators	2
Foreign Countries	2

Source

Applied Measurement in…

Publication Type

Journal Articles	17
Reports - Research	16
Reports - Evaluative	1
Tests/Questionnaires	1

Education Level

Secondary Education	5
Elementary Education	2
Grade 4	2
Grade 7	2
Higher Education	2
Junior High Schools	2
Middle Schools	2
Postsecondary Education	2
Early Childhood Education	1
Elementary Secondary Education	1
Grade 10	1
Grade 11	1
Grade 3	1
Grade 5	1
Grade 6	1
High Schools	1
Intermediate Grades	1
Primary Education	1

Location

California	1
Canada	1

Assessments and Surveys

Iowa Tests of Basic Skills	1
Program for International…	1

Showing 1 to 15 of 17 results

IRT Characteristic Curve Linking Methods Weighted by Information for Mixed-Format Tests

Peer reviewed

Wang, Shaojie; Lee, Won-Chan; Zhang, Minqiang; Yuan, Lixin – Applied Measurement in Education, 2024

To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…

Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement

Comparison of Methods for Identifying Differential Step Functioning with Polytomous Item Response Data

Peer reviewed

Finch, Holmes – Applied Measurement in Education, 2022

Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…

Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation

Effects of Using Double Ratings as Item Scores on IRT Proficiency Estimation

Peer reviewed

Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022

This article presents the performance of item response theory (IRT) models when double ratings are used as item scores over single ratings when rater effects are present. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…

Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy

Performance of Infit and Outfit Confidence Intervals Calculated via Parametric Bootstrapping

Peer reviewed

Silva Diaz, John Alexander; Köhler, Carmen; Hartig, Johannes – Applied Measurement in Education, 2022

Testing item fit is central in item response theory (IRT) modeling, since a good fit is necessary to draw valid inferences from estimated model parameters. "Infit" and "outfit" fit statistics, widespread indices for detecting deviations from the Rasch model, are affected by data factors, such as sample size. Consequently, the…

Descriptors: Intervals, Item Response Theory, Item Analysis, Inferences

Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data

Peer reviewed

Albano, Tony; French, Brian F.; Vo, Thao Thu – Applied Measurement in Education, 2024

Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…

Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests

Dissecting Knowledge, Guessing, and Blunder in Multiple Choice Assessments

Peer reviewed

Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023

Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…

Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models

Response Demands of Reading Comprehension Test Items: A Review of Item Difficulty Modeling Studies

Peer reviewed

Ferrara, Steve; Steedle, Jeffrey T.; Frantz, Roger S. – Applied Measurement in Education, 2022

Item difficulty modeling studies involve (a) hypothesizing item features, or item response demands, that are likely to predict item difficulty with some degree of accuracy; and (b) entering the features as independent variables into a regression equation or other statistical model to predict difficulty. In this review, we report findings from 13…

Descriptors: Reading Comprehension, Reading Tests, Test Items, Item Response Theory

Examining Three Learning Progressions in Middle-School Mathematics for Formative Assessment

Peer reviewed

Pham, Duy N.; Wells, Craig S.; Bauer, Malcolm I.; Wylie, E. Caroline; Monroe, Scott – Applied Measurement in Education, 2021

Assessments built on a theory of learning progressions are promising formative tools to support learning and teaching. The quality and usefulness of those assessments depend, in large part, on the validity of the theory-informed inferences about student learning made from the assessment results. In this study, we introduced an approach to address…

Descriptors: Formative Evaluation, Mathematics Instruction, Mathematics Achievement, Middle School Students

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

An IRT Mixture Model for Rating Scale Confusion Associated with Negatively Worded Items in Measures of Social-Emotional Learning

Peer reviewed

Bolt, Daniel; Wang, Yang Caroline; Meyer, Robert H.; Pier, Libby – Applied Measurement in Education, 2020

We illustrate the application of mixture IRT models to evaluate respondent confusion due to the negative wording of certain items on a social-emotional learning (SEL) assessment. Using actual student self-report ratings on four social-emotional learning scales collected from students in grades 3-12 from CORE Districts in the state of California,…

Descriptors: Item Response Theory, Social Emotional Learning, Self Evaluation (Individuals), Measurement Techniques

Regression Effects in Angoff Ratings: Examples from Credentialing Exams

Peer reviewed

Wyse, Adam E. – Applied Measurement in Education, 2018

This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…

Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

An Empirical Investigation of Methods for Assessing Item Fit for Mixed Format Tests

Peer reviewed

Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N. – Applied Measurement in Education, 2013

Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included the PARSCALE's G²,…

Descriptors: Test Format, Test Items, Item Analysis, Goodness of Fit

The Effect of Changing Content on IRT Scaling Methods

Peer reviewed

Keller, Lisa A.; Keller, Robert R. – Applied Measurement in Education, 2015

Equating test forms is an essential activity in standardized testing, with increased importance with the accountability systems in existence through the mandate of Adequate Yearly Progress. It is through equating that scores from different test forms become comparable, which allows for the tracking of changes in the performance of students from…

Descriptors: Item Response Theory, Rating Scales, Standardized Tests, Scoring Rubrics

Considering the Use of General and Modified Assessment Items in Computerized Adaptive Testing

Peer reviewed

Wyse, Adam E.; Albano, Anthony D. – Applied Measurement in Education, 2015

This article used several data sets from a large-scale state testing program to examine the feasibility of combining general and modified assessment items in computerized adaptive testing (CAT) for different groups of students. Results suggested that several of the assumptions made when employing this type of mixed-item CAT may not be met for…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Testing Programs

Author

Lee, Won-Chan	2
Wyse, Adam E.	2
Abu-Ghazalah, Rashid M.	1
Abulela, Mohammed A. A.	1
Albano, Anthony D.	1
Ansley, Timothy N.	1
Bauer, Malcolm I.	1
Bolt, Daniel	1
Brian F. French	1
Chon, Kyong Hee	1
Dubins, David N.	1
Ferrara, Steve	1
Finch, Holmes	1
Frantz, Roger S.	1
Hartig, Johannes	1
Keller, Lisa A.	1
Keller, Robert R.	1
Köhler, Carmen	1
Lee, Yoonsun	1
Lixin Yuan	1
Meyer, Robert H.	1
Minqiang Zhang	1
Monroe, Scott	1
Pham, Duy N.	1
Phillips, Gary W.	1