ERIC - Search Results

Showing 1 to 15 of 16 results

Identifying Disengaged Survey Responses: New Evidence Using Response Time Metadata

Peer reviewed

Soland, James; Wise, Steven L.; Gao, Lingyun – Applied Measurement in Education, 2019

Disengaged responding is a phenomenon that often biases observed scores from achievement tests and surveys in practically and statistically significant ways. This problem has led to the development of methods to detect and correct for disengaged responses on both achievement test and survey scores. One major disadvantage when trying to detect…

Descriptors: Reaction Time, Metadata, Response Style (Tests), Student Surveys

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

Performance Decline as an Indicator of Generalized Test-Taking Disengagement

Peer reviewed

Wise, Steven L.; Kingsbury, G. Gage – Applied Measurement in Education, 2022

In achievement testing we assume that students will demonstrate their maximum performance as they encounter test items. Sometimes, however, student performance can decline during a test event, which implies that the test score does not represent maximum performance. This study describes a method for identifying significant performance decline and…

Descriptors: Achievement Tests, Performance, Classification, Guessing (Tests)

A General Approach to Measuring Test-Taking Effort on Computer-Based Tests

Peer reviewed

Wise, Steven L.; Gao, Lingyun – Applied Measurement in Education, 2017

There has been an increased interest in the impact of unmotivated test taking on test performance and score validity. This has led to the development of new ways of measuring test-taking effort based on item response time. In particular, Response Time Effort (RTE) has been shown to provide an assessment of effort down to the level of individual…

Descriptors: Test Bias, Computer Assisted Testing, Item Response Theory, Achievement Tests

Is Teacher Value Added a Matter of Scale? The Practical Consequences of Treating an Ordinal Scale as Interval for Estimation of Teacher Effects

Peer reviewed

Soland, James – Applied Measurement in Education, 2017

Research shows that assuming a test scale is equal-interval can be problematic, especially when the assessment is being used to achieve a policy aim like evaluating growth over time. However, little research considers whether teacher value added is sensitive to the underlying test scale, and in particular whether treating an ordinal scale as…

Descriptors: Intervals, Value Added Models, Teacher Evaluation, Teacher Effectiveness

Negative Keying Effects in the Factor Structure of TIMSS 2011 Motivation Scales and Associations with Reading Achievement

Peer reviewed

Michaelides, Michalis P. – Applied Measurement in Education, 2019

The Student Background survey administered along with achievement tests in studies of the International Association for the Evaluation of Educational Achievement includes scales of student motivation, competence, and attitudes toward mathematics and science. The scales consist of positively and negatively keyed items. The current research…

Descriptors: International Assessment, Achievement Tests, Mathematics Achievement, Mathematics Tests

Diagnosing Competency Mastery in Science: An Application of GDM to TIMSS 2011 Data

Peer reviewed

Kabiri, Masoud; Ghazi-Tabatabaei, Mahmood; Bazargan, Abbas; Shokoohi-Yekta, Mohsen; Kharrazi, Kamal – Applied Measurement in Education, 2017

Numerous diagnostic studies have been conducted on large-scale assessments to illustrate the students' mastery profile in the areas of math and reading; however, for science a limited number of investigations are reported. This study investigated Iranian eighth graders' competency mastery of science and examined the utility of the General…

Descriptors: Elementary Secondary Education, Achievement Tests, International Assessment, Foreign Countries

Measurement Properties of Two Innovative Item Formats in a Computer-Based Test

Peer reviewed

Wan, Lei; Henly, George A. – Applied Measurement in Education, 2012

Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…

Descriptors: Test Items, Test Format, Computer Assisted Testing, Measurement

Comparisons of Methodologies and Results in Vertical Scaling for Educational Achievement Tests

Peer reviewed

Tong, Ye; Kolen, Michael J. – Applied Measurement in Education, 2007

A number of vertical scaling methodologies were examined in this article. Scaling variations included data collection design, scaling method, item response theory (IRT) scoring procedure, and proficiency estimation method. Vertical scales were developed for Grade 3 through Grade 8 for 4 content areas and 9 simulated datasets. A total of 11 scaling…

Descriptors: Achievement Tests, Scaling, Methods, Item Response Theory

The Use of a Person-Fit Statistic with One High-Quality Achievement Test.

Peer reviewed

Rudner, Lawrence M.; And Others – Applied Measurement in Education, 1996

An analysis of data from the 1990 National Assessment of Educational Progress Trial State Assessment suggests that person-fit statistics may not provide additional information about results of psychometrically strong achievement tests. More research is needed before person-fit statistics can be used routinely in analysis of item response data.…

Descriptors: Achievement Tests, Individual Differences, Item Response Theory, Psychometrics

Sensitivity of Equating Results to Different Sampling Strategies.

Peer reviewed

Schmitt, Alicia P.; And Others – Applied Measurement in Education, 1990

Equating two parallel forms of the College Board Biology Achievement Test using three sampling strategies was examined. For each strategy, five equating procedures were studied: Tucker and Levine equally reliable linear equatings; frequency estimation equipercentile equatings; chained equipercentile curvilinear equatings; and three-parameter…

Descriptors: Achievement Tests, Biology, College Entrance Examinations, Equated Scores

Sex Differences in the Tendency to Omit Items on Multiple-Choice Tests: 1980-2000

Peer reviewed

von Schrader, Sarah; Ansley, Timothy – Applied Measurement in Education, 2006

Much has been written concerning the potential group differences in responding to multiple-choice achievement test items. This discussion has included references to possible disparities in tendency to omit such test items. When test scores are used for high-stakes decision making, even small differences in scores and rankings that arise from male…

Descriptors: Gender Differences, Multiple Choice Tests, Achievement Tests, Grade 3

Has Item Response Theory Increased the Validity of Achievement Test Scores?

Peer reviewed

Linn, Robert L. – Applied Measurement in Education, 1990

The contribution of item response theory to the validity of interpretations of achievement test results is reviewed in the context of four applications. The applications include construction of scales for achievement tests, test construction, development of customized tests, and investigation of the influence of instruction on achievement tests.…

Descriptors: Achievement Tests, Elementary Secondary Education, Instructional Effectiveness, Item Response Theory

The Utility of a Modified One-Parameter IRT Model with Small Samples.

Peer reviewed

Barnes, Laura L. B.; Wise, Steven L. – Applied Measurement in Education, 1991

One-parameter and three-parameter item response theory (IRT) model estimates were compared with estimates obtained from two modified one-parameter models that incorporated a constant nonzero guessing parameter. Using small-sample simulation data (50, 100, and 200 simulated examinees), modified 1-parameter models were most effective in estimating…

Descriptors: Ability, Achievement Tests, Comparative Analysis, Computer Simulation

Nonparametric Person-Fit Research: Some Theoretical Issues and an Empirical Example.

Peer reviewed

Meijer, Rob R.; And Others – Applied Measurement in Education, 1996

Several existing group-based statistics to detect improbable item score patterns are discussed, along with the cut scores proposed in the literature to classify an item score pattern as aberrant. A simulation study and an empirical study are used to compare the statistics and their use and to investigate the practical use of cut scores. (SLD)

Descriptors: Achievement Tests, Classification, Cutting Scores, Identification
