Showing 1 to 15 of 40 results
Peer reviewed
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores, and hence to incomplete data, on credentialing tests such as the United States Medical Licensing Examination. Feinberg compared four approaches for reporting pass-fail decisions to the examinees with incomplete data on credentialing…
Descriptors: Testing Problems, High Stakes Tests, Credentials, Test Items
Peer reviewed
Andrés Christiansen; Rianne Janssen – Educational Assessment, Evaluation and Accountability, 2024
In international large-scale assessments, students may not be compelled to answer every test item: a student can decide to skip a seemingly difficult item or may drop out before the end of the test is reached. The way these missing responses are treated will affect the estimation of the item difficulty and student ability, and ultimately affect…
Descriptors: Test Items, Item Response Theory, Grade 4, International Assessment
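The treatment of missing responses described in this abstract can be illustrated with a minimal sketch (hypothetical data, not from the study): scoring skipped items as incorrect versus dropping them yields different classical difficulty estimates for the same item.

```python
# Sketch: effect of missing-response treatment on an item's difficulty
# (proportion-correct). Hypothetical responses: 1 = correct, 0 = incorrect,
# None = skipped.
responses = [1, 0, None, 1, None, 0, 1, 1]

# Treatment A: score skipped responses as incorrect.
as_incorrect = [r if r is not None else 0 for r in responses]
p_incorrect = sum(as_incorrect) / len(as_incorrect)  # 4/8 = 0.5

# Treatment B: drop skipped responses from the calculation.
observed = [r for r in responses if r is not None]
p_ignored = sum(observed) / len(observed)  # 4/6 ≈ 0.667
```

The gap between the two estimates grows with the skip rate, which is why the choice of treatment propagates into item difficulty and ability estimation.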
Peer reviewed
Chen, Yunxiao; Lee, Yi-Hsuan; Li, Xiaoou – Journal of Educational and Behavioral Statistics, 2022
In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric…
Descriptors: Standardized Tests, Test Items, Test Validity, Scores
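Sequential monitoring of a reused item can be sketched with a generic change-point detector; the one-sided CUSUM below is a standard illustration, not the article's specific procedure, and all numbers are hypothetical.

```python
# Sketch: one-sided CUSUM on an item's per-administration p-value,
# flagging administrations where cumulative downward drift from the
# historical target exceeds a decision limit h.
def cusum_flags(values, target, k=0.02, h=0.06):
    """Flag each administration once accumulated drift below target exceeds h."""
    s, flags = 0.0, []
    for v in values:
        s = max(0.0, s + (target - v) - k)  # accumulate drift, allowance k
        flags.append(s > h)
    return flags

p_values = [0.70, 0.69, 0.71, 0.55, 0.54, 0.53]  # hypothetical item p-values
cusum_flags(p_values, target=0.70)  # flags the last three administrations
```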
Peer reviewed
PDF on ERIC
Camenares, Devin – International Journal for the Scholarship of Teaching and Learning, 2022
Balancing assessment of learning outcomes with the expectations of students is a perennial challenge in education. Difficult exams, in which many students perform poorly, exacerbate this problem and can inspire a wide variety of interventions, such as a grading curve. However, addressing poor performance can sometimes distort or inflate grades and…
Descriptors: College Students, Student Evaluation, Tests, Test Items
Peer reviewed
PDF on ERIC
Kim, Sooyeon; Walker, Michael – ETS Research Report Series, 2021
In this investigation, we used real data to assess potential differential effects associated with taking a test in a test center (TC) versus testing at home using remote proctoring (RP). We used a pseudo-equivalent groups (PEG) approach to examine group equivalence at the item level and the total score level. If our assumption holds that the PEG…
Descriptors: Testing, Distance Education, Comparative Analysis, Test Items
Peer reviewed
PDF on ERIC
Fu, Jianbin; Qu, Yanxuan – ETS Research Report Series, 2018
Various subscore estimation methods that use auxiliary information to improve subscore accuracy and stability have been developed. This report provides a review of various subscore estimation methods described in the literature. The methodology of each method is described, then research studies on these subscore estimation methods are summarized.…
Descriptors: Scores, Evaluation Methods, Item Response Theory, Test Items
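The simplest estimator in the family this report reviews is Kelley's regressed score, which shrinks an observed subscore toward the group mean in proportion to its reliability. A minimal sketch with hypothetical numbers:

```python
# Sketch: Kelley's formula for a true-score estimate of a subscore.
# Low reliability pulls the estimate strongly toward the group mean;
# reliability of 1.0 leaves the observed subscore unchanged.
def kelley_estimate(observed, group_mean, reliability):
    """Regress the observed subscore toward the group mean."""
    return group_mean + reliability * (observed - group_mean)

kelley_estimate(18.0, 12.0, 0.6)  # -> 15.6
```

The auxiliary-information methods surveyed in the report extend this idea by borrowing strength from other subscores or the total score.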
Peer reviewed
Sinharay, Sandip – Journal of Educational Measurement, 2017
Person-fit assessment (PFA) is concerned with uncovering atypical test performance as reflected in the pattern of scores on individual items on a test. Existing person-fit statistics (PFSs) include both parametric and nonparametric statistics. Comparison of PFSs has been a popular research topic in PFA, but almost all comparisons have employed…
Descriptors: Goodness of Fit, Testing, Test Items, Scores
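One classic nonparametric person-fit statistic is the count of Guttman errors: pairs where an examinee answers a harder item correctly but an easier item incorrectly. The sketch below illustrates that idea with hypothetical data; it is not the specific set of statistics compared in the article.

```python
# Sketch: count Guttman errors in one examinee's score pattern.
# scores[i] is 0/1 for item i; difficulties[i] is item i's difficulty.
def guttman_errors(scores, difficulties):
    """Count (easier item wrong, harder item right) pairs."""
    errors = 0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if difficulties[i] < difficulties[j] and scores[i] == 0 and scores[j] == 1:
                errors += 1
    return errors

# Hypothetical aberrant pattern: misses the two easiest items,
# answers the two hardest correctly.
guttman_errors([0, 0, 1, 1], [0.2, 0.5, 1.0, 1.5])  # -> 4
```

A perfectly consistent (Guttman) pattern yields zero errors; large counts suggest atypical test performance.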
Peer reviewed
Guo, Hongwen; Rios, Joseph A.; Haberman, Shelby; Liu, Ou Lydia; Wang, Jing; Paek, Insu – Applied Measurement in Education, 2016
Unmotivated test takers using rapid guessing in item responses can affect validity studies and teacher and institution performance evaluation negatively, making it critical to identify these test takers. The authors propose a new nonparametric method for finding response-time thresholds for flagging item responses that result from rapid-guessing…
Descriptors: Guessing (Tests), Reaction Time, Nonparametric Statistics, Models
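Applying a response-time threshold to flag rapid guesses can be sketched as follows. The threshold rule here (a fixed fraction of the median item time) is a common heuristic used only for illustration; it is not the nonparametric threshold-finding method the authors propose, and the times are hypothetical.

```python
# Sketch: flag responses faster than a fraction of the median item time
# as likely rapid guesses.
import statistics

def flag_rapid_guesses(times, fraction=0.10):
    """Return booleans marking responses below the time threshold."""
    threshold = fraction * statistics.median(times)
    return [t < threshold for t in times]

times = [42.0, 38.5, 2.1, 40.2, 1.5, 36.8]  # seconds per item (hypothetical)
flag_rapid_guesses(times)  # flags the 2.1 s and 1.5 s responses
```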
Peer reviewed
Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009
Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…
Descriptors: Test Bias, Test Items, Evaluation Methods, Scores
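The step-level view that distinguishes DSF from a single item-level DIF index can be sketched with a log odds ratio computed separately at each score step. The counts below are hypothetical and the statistic is a generic illustration, not the article's specific DSF estimator.

```python
# Sketch: differential step functioning examines each score step of a
# polytomous item separately. Here, a log odds ratio of clearing one
# step for the reference vs. focal group.
import math

def step_log_odds_ratio(ref_pass, ref_fail, foc_pass, foc_fail):
    """Log odds ratio of clearing a score step, reference vs. focal group."""
    return math.log((ref_pass / ref_fail) / (foc_pass / foc_fail))

# Step 1 shows little DIF; step 2 favors the reference group --
# a pattern a single item-level index would average away.
step_log_odds_ratio(80, 20, 78, 22)  # ~ 0.12
step_log_odds_ratio(60, 40, 40, 60)  # ~ 0.81
```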
Townsend, Michael A. R.; Mahoney, Peggy – 1980
The roles of humor and anxiety in test performance were investigated. Measures of trait anxiety, state anxiety and achievement were obtained on a sample of undergraduate students; the A-Trait and A-State scales of the State-Trait Anxiety Inventory were used. Half of the students received additional humorous items in the achievement test. The…
Descriptors: Achievement Tests, Anxiety, Higher Education, Humor
Haenn, Joseph F. – 1981
Procedures for conducting functional level testing have been available for use by practitioners for some time. However, the Title I Evaluation and Reporting System (TIERS), developed in response to the Education Amendments of 1974 to the Elementary and Secondary Education Act (ESEA), has provided the impetus for widespread adoption of this…
Descriptors: Achievement Tests, Difficulty Level, Scores, Scoring
Curry, Allen R.; Riegel, N. Blyth – 1978
The Rasch model of test theory is described in general terms, compared with latent trait theory, and shown to have interesting applications for the measurement of affective as well as cognitive traits. Three assumptions of the Rasch model are stated to support the conclusion that calibration of the items and tests is independent of the examinee…
Descriptors: Affective Measures, Goodness of Fit, Item Analysis, Latent Trait Theory
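The Rasch model described in this entry has a compact form: the probability of a correct response depends only on the difference between person ability and item difficulty, both on the logit scale. A minimal sketch:

```python
# Sketch of the Rasch (1PL) model: P(correct) is a logistic function of
# the difference between person ability (theta) and item difficulty (b).
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

rasch_p(0.0, 0.0)  # ability equal to difficulty -> 0.5
rasch_p(1.0, 0.0)  # more able examinee -> higher probability (~0.731)
```

Because ability and difficulty enter only through their difference, item calibration does not depend on which examinees happen to take the test, which is the separability property the abstract alludes to.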
Klein, Stephen P.; Bolus, Roger – 1983
One way to reduce the likelihood of one examinee copying another's answers on large-scale tests that require all examinees to answer the same set of questions is to use multiple test forms that differ in item ordering. This study was conducted to determine whether varying the sequence in which blocks of items were presented to…
Descriptors: Adults, Cheating, Cost Effectiveness, Item Analysis
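Building forms that differ only in block ordering can be sketched with a simple cyclic rotation (block labels hypothetical), so neighboring examinees see the same questions in different sequences:

```python
# Sketch: generate multiple test forms as cyclic rotations of item blocks.
def rotated_forms(blocks, n_forms):
    """Return n_forms orderings, each a cyclic rotation of the blocks."""
    return [blocks[i:] + blocks[:i] for i in range(n_forms)]

rotated_forms(["A", "B", "C"], 3)
# [['A', 'B', 'C'], ['B', 'C', 'A'], ['C', 'A', 'B']]
```

Whether such reordering alters item difficulty is exactly the empirical question the study addresses.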
Mills, Craig N.; Hambleton, Ronald K. – 1980
General guidelines exist for reporting and interpreting test scores, but there are shortcomings in the available technology, especially when applied to criterion-referenced tests. Concerns that have been expressed in the educational measurement literature address the uses of test scores, the manner of reporting scores, limited testing knowledge…
Descriptors: Criterion Referenced Tests, Educational Objectives, Elementary Secondary Education, Guidelines
Peer reviewed
Wilson, Sandra Meachan; Hiscox, Michael D. – Educational Measurement: Issues and Practice, 1984
This article presents a model that local school districts can use to reanalyze standardized test results and obtain a more valid assessment of local learning objectives. The model can be used to identify strengths and weaknesses of existing programs as well as of individual students. (EGS)
Descriptors: Educational Objectives, Item Analysis, Models, School Districts