Showing all 9 results
Peer reviewed
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
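For context on this entry's method family: characteristic curve linking chooses a slope $A$ and intercept $B$ for the scale transformation by minimizing the distance between item characteristic curves computed from the two calibrations, and an information-weighted variant would weight each item's contribution by its Fisher information, downweighting poorly estimated regions. A sketch of such a criterion in Haebara form (the weighting shown is illustrative, not necessarily the authors' exact formulation):

$$F(A,B)=\int \sum_{j} w_j(\theta)\left[P_j\big(\theta;\hat a_j,\hat b_j\big)-P_j\Big(\theta;\tfrac{\hat a_j^{*}}{A},\,A\hat b_j^{*}+B\Big)\right]^{2} f(\theta)\,d\theta, \qquad w_j(\theta)\propto I_j(\theta),$$

where $\hat a_j,\hat b_j$ and $\hat a_j^{*},\hat b_j^{*}$ are the item parameter estimates from the two forms and $I_j(\theta)$ is the item information function.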
Peer reviewed
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
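A minimal sketch of the kind of rater-by-item comparison this entry addresses: compute an agreement statistic (here quadratic weighted kappa, via scikit-learn) for each automated rater on each item, then summarize ranks across items. The data layout and ranking rule are hypothetical, not the authors' procedure:

```python
# Compare several automated raters to human scores, item by item.
# Simulated 0-3 scores; QWK as the agreement metric (an assumption).
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
n_raters, n_items, n_resp = 3, 10, 200
human = rng.integers(0, 4, size=(n_items, n_resp))   # human scores per item
auto = np.clip(human + rng.integers(-1, 2, size=(n_raters, n_items, n_resp)), 0, 3)

# Quadratic weighted kappa for every (rater, item) pair.
qwk = np.array([[cohen_kappa_score(human[i], auto[r, i], weights="quadratic")
                 for i in range(n_items)] for r in range(n_raters)])

# Rank raters within each item (1 = best), then average ranks across items.
within_item_rank = n_raters - qwk.argsort(axis=0).argsort(axis=0)
for r in range(n_raters):
    print(f"rater {r}: mean QWK {qwk[r].mean():.3f}, "
          f"mean rank {within_item_rank[r].mean():.2f}")
```

As the abstract notes, such rankings can hinge on the specifics of the metric and ranking procedure, which is exactly the sensitivity the paper takes up.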
Peer reviewed
Liang, Tie; Wells, Craig S. – Applied Measurement in Education, 2015
Investigating the fit of a parametric model plays a vital role in validating an item response theory (IRT) model. An area that has received little attention is the assessment of multiple IRT models used in a mixed-format test. The present study extends the nonparametric approach, proposed by Douglas and Cohen (2001), to assess model fit of three…
Descriptors: Nonparametric Statistics, Goodness of Fit, Item Response Theory, Test Format
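The Douglas and Cohen approach compares a parametric item characteristic curve with a nonparametric, kernel-smoothed estimate of the same curve. A minimal sketch for one dichotomous item, with simulated data and the generating parameters standing in for a fitted 2PL (illustrative only; the study's mixed-format extension involves more):

```python
# Contrast a parametric 2PL ICC with a kernel-smoothed empirical ICC.
import numpy as np

def icc_2pl(theta, a, b):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def kernel_icc(theta_hat, responses, grid, h=0.3):
    # Nadaraya-Watson estimate of P(correct | theta), Gaussian kernel.
    w = np.exp(-0.5 * ((grid[:, None] - theta_hat[None, :]) / h) ** 2)
    return (w @ responses) / w.sum(axis=1)

rng = np.random.default_rng(1)
theta = rng.normal(size=2000)
a, b = 1.2, 0.3                                   # 2PL parameters (stand-ins)
y = (rng.random(theta.size) < icc_2pl(theta, a, b)).astype(float)

grid = np.linspace(-3, 3, 61)
rmsd = np.sqrt(np.mean((kernel_icc(theta, y, grid) - icc_2pl(grid, a, b)) ** 2))
print(f"RMSD between nonparametric and parametric ICC: {rmsd:.4f}")
```

A large RMSD, judged against a reference distribution, would flag item-level misfit.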
Peer reviewed
Liu, Ou Lydia; Wilson, Mark – Applied Measurement in Education, 2009
Many efforts have been made to determine and explain differential gender performance on large-scale mathematics assessments. A well-agreed-on conclusion is that gender differences are contextualized and vary across math domains. This study investigated the pattern of gender differences by item domain (e.g., Space and Shape, Quantity) and item type…
Descriptors: Gender Differences, Mathematics Tests, Measurement, Test Format
Peer reviewed
Keng, Leslie; McClarty, Katie Larsen; Davis, Laurie Laughlin – Applied Measurement in Education, 2008
This article describes a comparative study conducted at the item level for paper and online administrations of a statewide high stakes assessment. The goal was to identify characteristics of items that may have contributed to mode effects. Item-level analyses compared two modes of the Texas Assessment of Knowledge and Skills (TAKS) for up to four…
Descriptors: Computer Assisted Testing, Geometric Concepts, Grade 8, Comparative Analysis
Peer reviewed
Kim, Seonghoon; Kolen, Michael J. – Applied Measurement in Education, 2006
Four item response theory linking methods (two moment methods and two characteristic curve methods) were compared with concurrent (CO) calibration, focusing on their robustness to format effects (FEs) when the methods were applied to multidimensional data reflecting the FEs associated with mixed-format tests. Based on the quantification of…
Descriptors: Item Response Theory, Robustness (Statistics), Test Format, Comparative Analysis
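For reference, the two moment methods usually compared in this literature are mean/sigma and mean/mean, which pick the linking slope $A$ and intercept $B$ from moments of the item parameter estimates (standard textbook forms; the study's exact variants may differ):

$$\text{mean/sigma: } A=\frac{\sigma(\hat b_Y)}{\sigma(\hat b_X)}, \qquad \text{mean/mean: } A=\frac{\mu(\hat a_X)}{\mu(\hat a_Y)}, \qquad B=\mu(\hat b_Y)-A\,\mu(\hat b_X),$$

where $\hat a$ and $\hat b$ are discrimination and difficulty estimates on the two scales $X$ and $Y$. Characteristic curve methods instead minimize a curve-distance criterion such as the one sketched under the first entry above.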
Peer reviewed
Ponsoda, Vicente; Olea, Julio; Rodriguez, Maria Soledad; Revuelta, Javier – Applied Measurement in Education, 1999
Compared easy and difficult versions of self-adapted tests (SAT) and computerized adaptive tests. No significant differences were found among the tests for estimated ability or posttest state anxiety in studies with 187 Spanish high school students, although other significant differences were found. Discusses implications for interpreting test…
Descriptors: Ability, Adaptive Testing, Comparative Analysis, Computer Assisted Testing
Peer reviewed
Fitzpatrick, Anne R.; Lee, Guemin; Gao, Furong – Applied Measurement in Education, 2001
Used generalizability theory to assess the variation in school scores across very short test forms that measured mathematics in grades 4 and 8. More than 25,000 students took each form of the 3 tests for each grade. Results demonstrate the lack of comparability in school scores across short, nonparallel test forms and the importance of…
Descriptors: Comparative Analysis, Elementary School Students, Generalizability Theory, Institutional Characteristics
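As background on the method: a generalizability analysis here decomposes school-score variance into a school component and a school-by-form component (confounded with error), and summarizes comparability with a generalizability coefficient. A generic sketch for a schools (s) crossed with forms (f) design with $n_f$ forms, which may not match the study's exact design:

$$E\rho^{2}=\frac{\sigma^{2}_{s}}{\sigma^{2}_{s}+\sigma^{2}_{sf,e}/n_f},$$

so a large school-by-form component relative to true school variance, as with short, nonparallel forms, drives the coefficient down and signals the comparability problem the abstract describes.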
Peer reviewed
Harris, Abigail M.; Carlton, Sydell T. – Applied Measurement in Education, 1993
Differential item functioning on 6 forms of the Scholastic Aptitude Test was examined for 181,228 male and 198,668 female students, focusing on the points tested, the test format, and the subject matter in which items were embedded. Implications of the identifiable differences are discussed. (SLD)
Descriptors: College Entrance Examinations, Comparative Analysis, Females, High School Students
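A common screen behind DIF studies of this kind is the Mantel-Haenszel procedure: stratify examinees on a matching score, form a 2x2 (group by right/wrong) table per stratum, and pool the odds ratios; ETS reports the result on the delta scale as MH D-DIF = -2.35 ln(alpha_MH). A sketch on simulated data (the score banding and data are illustrative, not the study's analysis):

```python
# Mantel-Haenszel DIF screen for a single item, simulated data.
import numpy as np

rng = np.random.default_rng(2)
n = 5000
group = rng.integers(0, 2, n)            # 0 = reference, 1 = focal
theta = rng.normal(size=n)
total = np.clip(np.round(theta * 5 + 25), 0, 50).astype(int)  # matching score
p = 1 / (1 + np.exp(-(theta - 0.2 * group)))                  # built-in DIF
y = (rng.random(n) < p).astype(int)

num = den = 0.0
for k in np.unique(total // 5):          # stratify on score bands
    m = (total // 5) == k
    A = ((group[m] == 0) & (y[m] == 1)).sum()   # reference right
    B = ((group[m] == 0) & (y[m] == 0)).sum()   # reference wrong
    C = ((group[m] == 1) & (y[m] == 1)).sum()   # focal right
    D = ((group[m] == 1) & (y[m] == 0)).sum()   # focal wrong
    T = A + B + C + D
    if T:
        num += A * D / T
        den += B * C / T
alpha_mh = num / den
print(f"alpha_MH = {alpha_mh:.3f}, MH D-DIF = {-2.35 * np.log(alpha_mh):.3f}")
```

Negative MH D-DIF values flag items that disadvantage the focal group after matching on ability.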