ERIC - Search Results

Showing 1 to 15 of 24 results

An Exploratory Study Using Innovative Graphical Network Analysis to Model Eye Movements in Spatial Reasoning Problem Solving

Peer reviewed

Kaiwen Man; Joni M. Lakin – Journal of Educational Measurement, 2024

Eye-tracking procedures generate copious process data that could be valuable in establishing the response processes component of modern validity theory. However, there is a lack of tools for assessing and visualizing response processes using process data such as eye-tracking fixation sequences, especially those suitable for young children. This…

Descriptors: Problem Solving, Spatial Ability, Task Analysis, Network Analysis

Gender Bias in Test Item Formats: Evidence from PISA 2009, 2012, and 2015 Math and Reading Tests

Peer reviewed

Shear, Benjamin R. – Journal of Educational Measurement, 2023

Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…

Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests

A Comparison of Experimental and Observational Approaches to Assessing the Effects of Time Constraints in a Medical Licensing Examination

Peer reviewed

Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Baldwin, Peter; Margolis, Melissa J.; Bucak, Deniz; Jodoin, Michael; Walsh, William; Haist, Steven – Journal of Educational Measurement, 2018

Test administrators are appropriately concerned about the potential for time constraints to impact the validity of score interpretations; psychometric efforts to evaluate the impact of speededness date back more than half a century. The widespread move to computerized test delivery has led to the development of new approaches to evaluating how…

Descriptors: Comparative Analysis, Observation, Medical Education, Licensing Examinations (Professions)

Detection of Invalid Test Scores: The Usefulness of Simple Nonparametric Statistics

Peer reviewed

Tendeiro, Jorge N.; Meijer, Rob R. – Journal of Educational Measurement, 2014

In recent guidelines for fair educational testing it is advised to check the validity of individual test scores through the use of person-fit statistics. For practitioners it is unclear on the basis of the existing literature which statistic to use. An overview of relatively simple existing nonparametric approaches to identify atypical response…

Descriptors: Educational Assessment, Test Validity, Scores, Statistical Analysis

Item Analysis for Teacher-Made Mastery Tests

Peer reviewed

Crehan, Kevin D. – Journal of Educational Measurement, 1974

Various item selection techniques are compared on criterion-referenced reliability and validity. Techniques compared include three nominal criterion-referenced methods, a traditional point biserial selection, teacher selection, and random selection. (Author)

Descriptors: Comparative Analysis, Criterion Referenced Tests, Item Analysis, Item Banks

A Model of Rater Behavior in Essay Grading Based on Signal Detection Theory

Peer reviewed

DeCarlo, Lawrence T. – Journal of Educational Measurement, 2005

An approach to essay grading based on signal detection theory (SDT) is presented. SDT offers a basis for understanding rater behavior with respect to the scoring of constructed responses, in that it provides a theory of psychological processes underlying the raters' behavior. The approach also provides measures of the precision of the raters and the…

Descriptors: Validity, Simulation, Grading, Item Response Theory

Can Teachers Write Good True-False Test Items?

Peer reviewed

Ebel, Robert L. – Journal of Educational Measurement, 1975

Descriptors: Comparative Analysis, Multiple Choice Tests, Objective Tests, Teachers

A Note on the Comparability of Alternative Scoring Methods for the Institutional Functioning Inventory

Peer reviewed

Hartnett, Rodney T. – Journal of Educational Measurement, 1971

Alternative scoring methods yield essentially the same information, including scale intercorrelations and validity. Reasons for preferring the traditional psychometric scoring technique are offered. (Author/AG)

Descriptors: College Environment, Comparative Analysis, Correlation, Item Analysis

"Mental Model" Comparison of Automated and Human Scoring.

Peer reviewed

Williamson, David M.; Bejar, Isaac I.; Hone, Anne S. – Journal of Educational Measurement, 1999

Contrasts "mental models" used by automated scoring for the simulation division of the computerized Architect Registration Examination with those used by experienced human graders for 3,613 candidate solutions. Discusses differences in the models used and the potential of automated scoring to enhance the validity evidence of scores. (SLD)

Descriptors: Architects, Comparative Analysis, Computer Assisted Testing, Judges

Comparison of Two Procedures for Analyzing Multitrait Multimethod Matrices.

Peer reviewed

Lomax, Richard G.; Algina, James – Journal of Educational Measurement, 1979

Results of using multimethod factor analysis and exploratory factor analysis for the analysis of three multitrait-multimethod matrices are compared. Results suggest that the two methods can give quite different impressions of discriminant validity. In the examples considered, the former procedure tends to support discrimination while the latter…

Descriptors: Comparative Analysis, Factor Analysis, Goodness of Fit, Matrices

A Framework for Analyzing the Inference Structure of Educational Achievement Tests.

Peer reviewed

Wardrop, James L.; And Others – Journal of Educational Measurement, 1982

A structure for describing different approaches to testing is generated by identifying five dimensions along which tests differ: test uses, item generation, item revision, assessment of precision, and validation. These dimensions are used to profile tests of reading comprehension. Only norm-referenced achievement tests had an inference system…

Descriptors: Achievement Tests, Comparative Analysis, Educational Testing, Models

The Application of a Factorial Design to the Study of Cultural Bias in General Culture Items on the National Teacher Examination

Peer reviewed

Medley, Donald M.; Quirk, Thomas J. – Journal of Educational Measurement, 1974

Descriptors: Blacks, Comparative Analysis, Culture Fair Tests, Item Analysis

Measuring Subskills of Reading: Intercorrelations Between Standardized Reading Tests, Teachers' Ratings, and Reading Specialists' Ratings

Peer reviewed

Farr, Roger; Roelke, Patricia – Journal of Educational Measurement, 1971

Descriptors: Classroom Observation Techniques, Comparative Analysis, Measurement Techniques, Rating Scales

The Use of a Factor-Analysis Model for Assessing the Validity of Group Comparisons.

Peer reviewed

Hanna, Gila – Journal of Educational Measurement, 1984

The validity of a comparison of mean test scores for two groups and of a longitudinal comparison of means within each group is assessed. Using LISREL, factor analyses are used to test the hypotheses of similar factor patterns, equal units of measurement, and equal measurement accuracy between groups and across time. (Author/DWH)

Descriptors: Achievement Tests, Comparative Analysis, Data Analysis, Factor Analysis

A Comparison of the Validities of Conventional Choice Testing and Various Confidence Marking Procedures

Peer reviewed

Koehler, Roger A. – Journal of Educational Measurement, 1971

Descriptors: Achievement Tests, Comparative Analysis, Confidence Testing, Grade 11
