ERIC - Search Results

Showing 1 to 15 of 25 results

Personalized Online Learning, Test Fairness, and Educational Measurement: Considering Differential Content Exposure Prior to a High Stakes End of Course Exam

Peer reviewed

Katz, Daniel; Huggins-Manley, Anne Corinne; Leite, Walter – Applied Measurement in Education, 2022

According to the "Standards for Educational and Psychological Testing" (2014), one aspect of test fairness concerns examinees having comparable opportunities to learn prior to taking tests. Meanwhile, many researchers are developing platforms enhanced by artificial intelligence (AI) that can personalize curriculum to individual student…

Descriptors: High Stakes Tests, Test Bias, Testing Problems, Prior Learning

Rasch Model Extensions for Enhanced Formative Assessments in MOOCs

Peer reviewed

Abbakumov, Dmitry; Desmet, Piet; Van den Noortgate, Wim – Applied Measurement in Education, 2020

Formative assessments are an important component of massive open online courses (MOOCs), online courses with open access and unlimited student participation. Accurate conclusions on students' proficiency via formative assessments, however, face several challenges: (a) students are typically allowed to make several attempts; and (b) student performance might…

Descriptors: Item Response Theory, Formative Evaluation, Online Courses, Response Style (Tests)

Challenges to the Cattell-Horn-Carroll Theory: Empirical, Clinical, and Policy Implications

Peer reviewed

Canivez, Gary L.; Youngstrom, Eric A. – Applied Measurement in Education, 2019

The Cattell-Horn-Carroll (CHC) taxonomy of cognitive abilities married John Horn and Raymond Cattell's Extended Gf-Gc theory with John Carroll's Three-Stratum Theory. While there are some similarities in arrangements or classifications of tasks (observed variables) within similar broad or narrow dimensions, other salient theoretical features and…

Descriptors: Taxonomy, Cognitive Ability, Intelligence, Cognitive Tests

Investigating Repeater Effects on Small Sample Equating: Include or Exclude?

Peer reviewed

Diao, Hongyu; Keller, Lisa – Applied Measurement in Education, 2020

Examinees who attempt the same test multiple times are often referred to as "repeaters." Previous studies suggested that repeaters should be excluded from the total sample before equating because repeater groups are distinguishable from non-repeater groups. In addition, repeaters might memorize anchor items, causing item drift under a…

Descriptors: Licensing Examinations (Professions), College Entrance Examinations, Repetition, Testing Problems

Are Multiple-Choice Items Too Fat?

Peer reviewed

Haladyna, Thomas M.; Rodriguez, Michael C.; Stevens, Craig – Applied Measurement in Education, 2019

The evidence is mounting regarding the guidance to employ more three-option multiple-choice items. From theoretical analyses, empirical results, and practical considerations, such items are of equal or higher quality than four- or five-option items, and more items can be administered to improve content coverage. This study looks at 58 tests,…

Descriptors: Multiple Choice Tests, Test Items, Testing Problems, Guessing (Tests)

Are the Nonparametric Person-Fit Statistics More Powerful than Their Parametric Counterparts? Revisiting the Simulations in Karabatsos (2003)

Peer reviewed

Sinharay, Sandip – Applied Measurement in Education, 2017

Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristics curves and found the "H^T" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…

Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis

A New Procedure for Detection of Students' Rapid Guessing Responses Using Response Time

Peer reviewed

Guo, Hongwen; Rios, Joseph A.; Haberman, Shelby; Liu, Ou Lydia; Wang, Jing; Paek, Insu – Applied Measurement in Education, 2016

Unmotivated test takers using rapid guessing in item responses can affect validity studies and teacher and institution performance evaluation negatively, making it critical to identify these test takers. The authors propose a new nonparametric method for finding response-time thresholds for flagging item responses that result from rapid-guessing…

Descriptors: Guessing (Tests), Reaction Time, Nonparametric Statistics, Models

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Testing for Differences in Test Score Distributions Using Loglinear Models.

Peer reviewed

Hanson, Bradley A. – Applied Measurement in Education, 1996

Determining whether score distributions differ on two or more test forms administered to samples of examinees from a single population is explored with three statistical tests based on loglinear models. Examples are presented of applying tests of distribution differences to decide if equating is needed for alternative forms of a test. (SLD)

Descriptors: Equated Scores, Scoring, Statistical Distributions, Test Format

Reliability Estimation When a Test Is Split into Two Parts of Unknown Effective Length.

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 2002

Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…

Descriptors: Error of Measurement, Reliability, Scores, Test Construction

Simultaneous Use of Multiple Answer Copying Indexes to Improve Detection Rates

Peer reviewed

Wollack, James A. – Applied Measurement in Education, 2006

Many of the currently available statistical indexes to detect answer copying lack sufficient power at small α levels or when the amount of copying is relatively small. Furthermore, there is no one index that is uniformly best. Depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…

Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size

Effects of Content Polarization, Item Wording, and Rating Scale Width on Rating Responses.

Peer reviewed

Lam, Tony C. M.; Stevens, Joseph J. – Applied Measurement in Education, 1994

Effects of the following three variables on rating scale response were studied: (1) polarization of opinion regarding scale content; (2) intensity of item wording; and (3) psychological width of the scale. Results with 167 college students suggest best ways to balance polarization and item wording regardless of scale width. (SLD)

Descriptors: College Students, Content Analysis, Higher Education, Rating Scales

Training Test-Wiseness and Flawed Item Types.

Peer reviewed

Roznowski, Mary; Bassett, James – Applied Measurement in Education, 1992

Current coaching practices used in training test wiseness for analogy items on standardized test batteries were investigated in a 3-group design involving about 100 undergraduates in each condition. The largest improvement came in items in the middle range of difficulty, but overall effects of coaching were important. (SLD)

Descriptors: Difficulty Level, Higher Education, Standardized Tests, Teaching Methods

Effects of Scale Anchors on Student Ratings of Instructors.

Peer reviewed

Dunham, Trudy C.; Davison, Mark L. – Applied Measurement in Education, 1990

The effects of packing or skewing the response options of a scale on the common measurement problems of leniency and range restriction in instructor ratings were assessed. Results from a sample of 130 undergraduate education students indicate that packing reduced leniency but had no effect on range restriction. (TJH)

Descriptors: Education Majors, Higher Education, Professors, Rating Scales

Educational Assessment as a Promising Area for Psychometric Research.

Peer reviewed

Jones, Lyle V. – Applied Measurement in Education, 1988

Use of multiple-choice achievement tests is critiqued. Multiple-choice tests are considered heavily weighted toward aptitude and ill-suited to assessment of thinking. Psychometric methods for the development of alternatives to this inadequate form of testing achievement are discussed. (TJH)

Descriptors: Achievement Tests, Creative Thinking, Educational Assessment, Educational Research
