ERIC - Search Results

Showing 1 to 15 of 24 results

Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data

Peer reviewed

Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024

Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…

Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests

Can Adaptive Testing Improve Test-Taking Experience? A Case Study on Educational Survey Assessment

Peer reviewed

Yi-Hsuan Lee; Yue Jia – Applied Measurement in Education, 2024

Test-taking experience is a consequence of the interaction between students and assessment properties. We define a new notion, rapid-pacing behavior, to reflect two types of test-taking experience -- disengagement and speededness. To identify rapid-pacing behavior, we extend existing methods to develop response-time thresholds for individual items…

Descriptors: Adaptive Testing, Reaction Time, Item Response Theory, Test Format

Examining Three Learning Progressions in Middle-School Mathematics for Formative Assessment

Peer reviewed

Pham, Duy N.; Wells, Craig S.; Bauer, Malcolm I.; Wylie, E. Caroline; Monroe, Scott – Applied Measurement in Education, 2021

Assessments built on a theory of learning progressions are promising formative tools to support learning and teaching. The quality and usefulness of those assessments depend, in large part, on the validity of the theory-informed inferences about student learning made from the assessment results. In this study, we introduced an approach to address…

Descriptors: Formative Evaluation, Mathematics Instruction, Mathematics Achievement, Middle School Students

The Impact of Three Factors on the Recovery of Item Parameters for the Three-Parameter Logistic Model

Peer reviewed

Kim, Kyung Yong; Lee, Won-Chan – Applied Measurement in Education, 2017

This article provides a detailed description of three factors (specification of the ability distribution, numerical integration, and frame of reference for the item parameter estimates) that might affect the item parameter estimation of the three-parameter logistic model, and compares five item calibration methods, which are combinations of the…

Descriptors: Test Items, Item Response Theory, Comparative Analysis, Methods

Examinee Non-Effort on Contextualized and Non-Contextualized Mathematics Items in Large-Scale Assessments

Peer reviewed

Nijlen, Daniel Van; Janssen, Rianne – Applied Measurement in Education, 2015

In this study it is investigated to what extent contextualized and non-contextualized mathematics test items have a differential impact on examinee effort. Mixture item response theory (IRT) models are applied to two subsets of items from a national assessment on mathematics in the second grade of the pre-vocational track in secondary education in…

Descriptors: Mathematics Tests, Measurement, Item Response Theory, Test Items

Differential Item Functioning for Accommodated Students with Disabilities: Effect of Differences in Proficiency Distributions

Peer reviewed

Quesen, Sarah; Lane, Suzanne – Applied Measurement in Education, 2019

This study examined the effect of similar vs. dissimilar proficiency distributions on uniform DIF detection on a statewide eighth grade mathematics assessment. Results from the similar- and dissimilar-ability reference groups with an SWD focal group were compared for four models: logistic regression, hierarchical generalized linear model (HGLM),…

Descriptors: Test Items, Mathematics Tests, Grade 8, Item Response Theory

Exploring the Robustness of a Unidimensional Item Response Theory Model with Empirically Multidimensional Data

Peer reviewed

Anderson, Daniel; Kahn, Joshua D.; Tindal, Gerald – Applied Measurement in Education, 2017

Unidimensionality and local independence are two common assumptions of item response theory. The former implies that all items measure a common latent trait, while the latter implies that responses are independent, conditional on respondents' location on the latent trait. Yet, few tests are truly unidimensional. Unmodeled dimensions may result in…

Descriptors: Robustness (Statistics), Item Response Theory, Mathematics Tests, Grade 6

Negative Keying Effects in the Factor Structure of TIMSS 2011 Motivation Scales and Associations with Reading Achievement

Peer reviewed

Michaelides, Michalis P. – Applied Measurement in Education, 2019

The Student Background survey administered along with achievement tests in studies of the International Association for the Evaluation of Educational Achievement includes scales of student motivation, competence, and attitudes toward mathematics and science. The scales consist of positively- and negatively keyed items. The current research…

Descriptors: International Assessment, Achievement Tests, Mathematics Achievement, Mathematics Tests

An Empirical Comparison of DDF Detection Methods for Understanding the Causes of DIF in Multiple-Choice Items

Peer reviewed

Suh, Youngsuk; Talley, Anna E. – Applied Measurement in Education, 2015

This study compared and illustrated four differential distractor functioning (DDF) detection methods for analyzing multiple-choice items. The log-linear approach, two item response theory-model-based approaches with likelihood ratio tests, and the odds ratio approach were compared to examine the congruence among the four DDF detection methods.…

Descriptors: Test Bias, Multiple Choice Tests, Test Items, Methods

Diagnosing Competency Mastery in Science: An Application of GDM to TIMSS 2011 Data

Peer reviewed

Kabiri, Masoud; Ghazi-Tabatabaei, Mahmood; Bazargan, Abbas; Shokoohi-Yekta, Mohsen; Kharrazi, Kamal – Applied Measurement in Education, 2017

Numerous diagnostic studies have been conducted on large-scale assessments to illustrate the students' mastery profile in the areas of math and reading; however, for science a limited number of investigations are reported. This study investigated Iranian eighth graders' competency mastery of science and examined the utility of the General…

Descriptors: Elementary Secondary Education, Achievement Tests, International Assessment, Foreign Countries

Selection of Common Items as an Unrecognized Source of Variability in Test Equating: A Bootstrap Approximation Assuming Random Sampling of Common Items

Peer reviewed

Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014

The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…

Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference

The Effect of Small Group Discussion on Cutoff Scores during Standard Setting

Peer reviewed

Deunk, Marjolein I.; van Kuijk, Mechteld F.; Bosker, Roel J. – Applied Measurement in Education, 2014

Standard setting methods, like the Bookmark procedure, are used to assist education experts in formulating performance standards. Small group discussion is meant to help these experts in setting more reliable and valid cutoff scores. This study is an analysis of 15 small group discussions during two standards setting trajectories and their effect…

Descriptors: Cutting Scores, Standard Setting, Group Discussion, Reading Tests

The Language Factor in Elementary Mathematics Assessments: Computational Skills and Applied Problem Solving in a Multidimensional IRT Framework

Peer reviewed

Hickendorff, Marian – Applied Measurement in Education, 2013

The results of an exploratory study into measurement of elementary mathematics ability are presented. The focus is on the abilities involved in solving standard computation problems on the one hand and problems presented in a realistic context on the other. The objectives were to assess to what extent these abilities are shared or distinct, and…

Descriptors: Elementary School Mathematics, Mathematics Tests, Mathematics Skills, Problem Solving

Providing Subscale Scores for Diagnostic Information: A Case Study when the Test Is Essentially Unidimensional

Peer reviewed

Stone, Clement A.; Ye, Feifei; Zhu, Xiaowen; Lane, Suzanne – Applied Measurement in Education, 2010

Although reliability of subscale scores may be suspect, subscale scores are the most common type of diagnostic information included in student score reports. This research compared methods for augmenting the reliability of subscale scores for an 8th-grade mathematics assessment. Yen's Objective Performance Index, Wainer et al.'s augmented scores,…

Descriptors: Item Response Theory, Case Studies, Reliability, Scores

Examining the Effectiveness of Test Accommodation Using DIF and a Mixture IRT Model

Peer reviewed

Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012

This study examined the validity of test accommodation in third-eighth graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…

Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
