Man, Kaiwen; Lakin, Joni M. – Journal of Educational Measurement, 2024
Eye-tracking procedures generate copious process data that could be valuable in establishing the response processes component of modern validity theory. However, there is a lack of tools for assessing and visualizing response processes using process data such as eye-tracking fixation sequences, especially those suitable for young children. This…
Descriptors: Problem Solving, Spatial Ability, Task Analysis, Network Analysis
Shear, Benjamin R. – Journal of Educational Measurement, 2023
Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…
Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests
Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Baldwin, Peter; Margolis, Melissa J.; Bucak, Deniz; Jodoin, Michael; Walsh, William; Haist, Steven – Journal of Educational Measurement, 2018
Test administrators are appropriately concerned about the potential for time constraints to impact the validity of score interpretations; psychometric efforts to evaluate the impact of speededness date back more than half a century. The widespread move to computerized test delivery has led to the development of new approaches to evaluating how…
Descriptors: Comparative Analysis, Observation, Medical Education, Licensing Examinations (Professions)
Tendeiro, Jorge N.; Meijer, Rob R. – Journal of Educational Measurement, 2014
In recent guidelines for fair educational testing it is advised to check the validity of individual test scores through the use of person-fit statistics. For practitioners it is unclear on the basis of the existing literature which statistic to use. An overview of relatively simple existing nonparametric approaches to identify atypical response…
Descriptors: Educational Assessment, Test Validity, Scores, Statistical Analysis
Crehan, Kevin D. – Journal of Educational Measurement, 1974 (Peer reviewed)
Various item selection techniques are compared on criterion-referenced reliability and validity. Techniques compared include three nominal criterion-referenced methods, a traditional point biserial selection, teacher selection, and random selection. (Author)
Descriptors: Comparative Analysis, Criterion Referenced Tests, Item Analysis, Item Banks
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2005
An approach to essay grading based on signal detection theory (SDT) is presented. SDT offers a basis for understanding rater behavior with respect to the scoring of constructed responses, in that it provides a theory of psychological processes underlying the raters' behavior. The approach also provides measures of the precision of the raters and the…
Descriptors: Validity, Simulation, Grading, Item Response Theory
Ebel, Robert L. – Journal of Educational Measurement, 1975 (Peer reviewed)
Descriptors: Comparative Analysis, Multiple Choice Tests, Objective Tests, Teachers
Hartnett, Rodney T. – Journal of Educational Measurement, 1971 (Peer reviewed)
Alternative scoring methods yield essentially the same information, including scale intercorrelations and validity. Reasons for preferring the traditional psychometric scoring technique are offered. (Author/AG)
Descriptors: College Environment, Comparative Analysis, Correlation, Item Analysis
Williamson, David M.; Bejar, Isaac I.; Hone, Anne S. – Journal of Educational Measurement, 1999 (Peer reviewed)
Contrasts "mental models" used by automated scoring for the simulation division of the computerized Architect Registration Examination with those used by experienced human graders for 3,613 candidate solutions. Discusses differences in the models used and the potential of automated scoring to enhance the validity evidence of scores. (SLD)
Descriptors: Architects, Comparative Analysis, Computer Assisted Testing, Judges
Lomax, Richard G.; Algina, James – Journal of Educational Measurement, 1979 (Peer reviewed)
Results of using multimethod factor analysis and exploratory factor analysis for the analysis of three multitrait-multimethod matrices are compared. Results suggest that the two methods can give quite different impressions of discriminant validity. In the examples considered, the former procedure tends to support discrimination while the latter…
Descriptors: Comparative Analysis, Factor Analysis, Goodness of Fit, Matrices
Wardrop, James L.; And Others – Journal of Educational Measurement, 1982 (Peer reviewed)
A structure for describing different approaches to testing is generated by identifying five dimensions along which tests differ: test uses, item generation, item revision, assessment of precision, and validation. These dimensions are used to profile tests of reading comprehension. Only norm-referenced achievement tests had an inference system…
Descriptors: Achievement Tests, Comparative Analysis, Educational Testing, Models
Medley, Donald M.; Quirk, Thomas J. – Journal of Educational Measurement, 1974 (Peer reviewed)
Descriptors: Blacks, Comparative Analysis, Culture Fair Tests, Item Analysis
Farr, Roger; Roelke, Patricia – Journal of Educational Measurement, 1971 (Peer reviewed)
Descriptors: Classroom Observation Techniques, Comparative Analysis, Measurement Techniques, Rating Scales
Hanna, Gila – Journal of Educational Measurement, 1984 (Peer reviewed)
The validity of a comparison of mean test scores for two groups and of a longitudinal comparison of means within each group is assessed. Using LISREL, factor analyses are used to test the hypotheses of similar factor patterns, equal units of measurement, and equal measurement accuracy between groups and across time. (Author/DWH)
Descriptors: Achievement Tests, Comparative Analysis, Data Analysis, Factor Analysis
Koehler, Roger A. – Journal of Educational Measurement, 1971 (Peer reviewed)
Descriptors: Achievement Tests, Comparative Analysis, Confidence Testing, Grade 11
