Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 8 |
Since 2006 (last 20 years) | 18 |
Descriptor
Mathematics Tests | 24 |
Item Response Theory | 20 |
Test Items | 15 |
Grade 8 | 6 |
Difficulty Level | 5 |
Foreign Countries | 5 |
Multiple Choice Tests | 5 |
Reading Tests | 5 |
Constructed Response | 4 |
High School Students | 4 |
Test Format | 4 |
More ▼ |
Source
Applied Measurement in… | 24 |
Author
Publication Type
Journal Articles | 24 |
Reports - Research | 21 |
Reports - Evaluative | 3 |
Speeches/Meeting Papers | 2 |
Education Level
Elementary Education | 8 |
Secondary Education | 8 |
Junior High Schools | 7 |
Middle Schools | 7 |
Grade 8 | 6 |
Elementary Secondary Education | 3 |
Grade 4 | 3 |
Grade 7 | 3 |
Grade 3 | 2 |
Grade 6 | 2 |
Intermediate Grades | 2 |
More ▼ |
Audience
Location
Netherlands | 2 |
Belgium | 1 |
Finland | 1 |
Germany | 1 |
Iran (Tehran) | 1 |
Italy | 1 |
Romania | 1 |
Russia | 1 |
United Kingdom (Northern… | 1 |
Laws, Policies, & Programs
Assessments and Surveys
Trends in International… | 2 |
Progress in International… | 1 |
What Works Clearinghouse Rating
Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data
Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024
Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…
Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests
Yi-Hsuan Lee; Yue Jia – Applied Measurement in Education, 2024
Test-taking experience is a consequence of the interaction between students and assessment properties. We define a new notion, rapid-pacing behavior, to reflect two types of test-taking experience -- disengagement and speededness. To identify rapid-pacing behavior, we extend existing methods to develop response-time thresholds for individual items…
Descriptors: Adaptive Testing, Reaction Time, Item Response Theory, Test Format
Pham, Duy N.; Wells, Craig S.; Bauer, Malcolm I.; Wylie, E. Caroline; Monroe, Scott – Applied Measurement in Education, 2021
Assessments built on a theory of learning progressions are promising formative tools to support learning and teaching. The quality and usefulness of those assessments depend, in large part, on the validity of the theory-informed inferences about student learning made from the assessment results. In this study, we introduced an approach to address…
Descriptors: Formative Evaluation, Mathematics Instruction, Mathematics Achievement, Middle School Students
Kim, Kyung Yong; Lee, Won-Chan – Applied Measurement in Education, 2017
This article provides a detailed description of three factors (specification of the ability distribution, numerical integration, and frame of reference for the item parameter estimates) that might affect the item parameter estimation of the three-parameter logistic model, and compares five item calibration methods, which are combinations of the…
Descriptors: Test Items, Item Response Theory, Comparative Analysis, Methods
Nijlen, Daniel Van; Janssen, Rianne – Applied Measurement in Education, 2015
In this study it is investigated to what extent contextualized and non-contextualized mathematics test items have a differential impact on examinee effort. Mixture item response theory (IRT) models are applied to two subsets of items from a national assessment on mathematics in the second grade of the pre-vocational track in secondary education in…
Descriptors: Mathematics Tests, Measurement, Item Response Theory, Test Items
Quesen, Sarah; Lane, Suzanne – Applied Measurement in Education, 2019
This study examined the effect of similar vs. dissimilar proficiency distributions on uniform DIF detection on a statewide eighth grade mathematics assessment. Results from the similar- and dissimilar-ability reference groups with an SWD focal group were compared for four models: logistic regression, hierarchical generalized linear model (HGLM),…
Descriptors: Test Items, Mathematics Tests, Grade 8, Item Response Theory
Anderson, Daniel; Kahn, Joshua D.; Tindal, Gerald – Applied Measurement in Education, 2017
Unidimensionality and local independence are two common assumptions of item response theory. The former implies that all items measure a common latent trait, while the latter implies that responses are independent, conditional on respondents' location on the latent trait. Yet, few tests are truly unidimensional. Unmodeled dimensions may result in…
Descriptors: Robustness (Statistics), Item Response Theory, Mathematics Tests, Grade 6
Michaelides, Michalis P. – Applied Measurement in Education, 2019
The Student Background survey administered along with achievement tests in studies of the International Association for the Evaluation of Educational Achievement includes scales of student motivation, competence, and attitudes toward mathematics and science. The scales consist of positively- and negatively keyed items. The current research…
Descriptors: International Assessment, Achievement Tests, Mathematics Achievement, Mathematics Tests
Suh, Youngsuk; Talley, Anna E. – Applied Measurement in Education, 2015
This study compared and illustrated four differential distractor functioning (DDF) detection methods for analyzing multiple-choice items. The log-linear approach, two item response theory-model-based approaches with likelihood ratio tests, and the odds ratio approach were compared to examine the congruence among the four DDF detection methods.…
Descriptors: Test Bias, Multiple Choice Tests, Test Items, Methods
Kabiri, Masoud; Ghazi-Tabatabaei, Mahmood; Bazargan, Abbas; Shokoohi-Yekta, Mohsen; Kharrazi, Kamal – Applied Measurement in Education, 2017
Numerous diagnostic studies have been conducted on large-scale assessments to illustrate the students' mastery profile in the areas of math and reading; however, for science a limited number of investigations are reported. This study investigated Iranian eighth graders' competency mastery of science and examined the utility of the General…
Descriptors: Elementary Secondary Education, Achievement Tests, International Assessment, Foreign Countries
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Deunk, Marjolein I.; van Kuijk, Mechteld F.; Bosker, Roel J. – Applied Measurement in Education, 2014
Standard setting methods, like the Bookmark procedure, are used to assist education experts in formulating performance standards. Small group discussion is meant to help these experts in setting more reliable and valid cutoff scores. This study is an analysis of 15 small group discussions during two standards setting trajectories and their effect…
Descriptors: Cutting Scores, Standard Setting, Group Discussion, Reading Tests
Hickendorff, Marian – Applied Measurement in Education, 2013
The results of an exploratory study into measurement of elementary mathematics ability are presented. The focus is on the abilities involved in solving standard computation problems on the one hand and problems presented in a realistic context on the other. The objectives were to assess to what extent these abilities are shared or distinct, and…
Descriptors: Elementary School Mathematics, Mathematics Tests, Mathematics Skills, Problem Solving
Stone, Clement A.; Ye, Feifei; Zhu, Xiaowen; Lane, Suzanne – Applied Measurement in Education, 2010
Although reliability of subscale scores may be suspect, subscale scores are the most common type of diagnostic information included in student score reports. This research compared methods for augmenting the reliability of subscale scores for an 8th-grade mathematics assessment. Yen's Objective Performance Index, Wainer et al.'s augmented scores,…
Descriptors: Item Response Theory, Case Studies, Reliability, Scores
Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012
This study examined the validity of test accommodation in third-eighth graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
Previous Page | Next Page ยป
Pages: 1 | 2