Showing all 7 results
Peer reviewed
Clauser, Brian E.; Yaneva, Victoria; Baldwin, Peter; Ha, Le An; Mee, Janet – Applied Measurement in Education, 2024
Multiple-choice questions have become ubiquitous in educational measurement because the format allows for efficient and accurate scoring. Nonetheless, there remains continued interest in constructed-response formats. This interest has driven efforts to develop computer-based scoring procedures that can accurately and efficiently score these items.…
Descriptors: Computer Uses in Education, Artificial Intelligence, Scoring, Responses
Peer reviewed
Glazer, Nancy; Wolfe, Edward W. – Applied Measurement in Education, 2020
This introductory article describes how constructed response scoring is carried out, particularly the rater monitoring processes, and illustrates three potential designs for conducting rater monitoring in an operational scoring project. The introduction also presents a framework for interpreting research conducted by those who study the constructed…
Descriptors: Scoring, Test Format, Responses, Predictor Variables
Peer reviewed
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
Peer reviewed
Hanson, Bradley A. – Applied Measurement in Education, 1996
Determining whether score distributions differ on two or more test forms administered to samples of examinees from a single population is explored using three statistical tests based on loglinear models. Examples are presented of applying tests of distribution differences to decide if equating is needed for alternative forms of a test. (SLD)
Descriptors: Equated Scores, Scoring, Statistical Distributions, Test Format
Peer reviewed
Frisbie, David A.; Becker, Douglas F. – Applied Measurement in Education, 1990
Seventeen educational measurement textbooks were reviewed to analyze current perceptions regarding true-false achievement testing. A synthesis of the rules for item writing is presented, and the purported advantages and disadvantages of the true-false format derived from those texts are reviewed. (TJH)
Descriptors: Achievement Tests, Higher Education, Methods Courses, Objective Tests
Peer reviewed
Dunham, Trudy C.; Davison, Mark L. – Applied Measurement in Education, 1990
The effects of packing or skewing the response options of a scale on the common measurement problems of leniency and range restriction in instructor ratings were assessed. Results from a sample of 130 undergraduate education students indicate that packing reduced leniency but had no effect on range restriction. (TJH)
Descriptors: Education Majors, Higher Education, Professors, Rating Scales
Peer reviewed
Martinez, Michael E.; Bennett, Randy Elliot – Applied Measurement in Education, 1992
New developments in the use of automatically scorable constructed response item types for large-scale assessment are reviewed for five domains: (1) mathematical reasoning; (2) algebra problem solving; (3) computer science; (4) architecture; and (5) natural language. Ways in which these technologies are likely to shape testing are considered. (SLD)
Descriptors: Algebra, Architecture, Automation, Computer Science