ERIC - Search Results

Showing all 12 results

Generalizability Theory Approach to Analyzing Automated-Item Generated Test Forms

Peer reviewed

Stella Y. Kim; Sungyeun Kim – Educational Measurement: Issues and Practice, 2025

This study presents several multivariate Generalizability theory designs for analyzing automatic item-generated (AIG) based test forms. The study used real data to illustrate the analysis procedure and discuss practical considerations. We collected the data from two groups of students, each group receiving a different form generated by AIG. A…

Descriptors: Generalizability Theory, Automation, Test Items, Students

Linking Unlinkable Tests: A Step Forward

Peer reviewed

Silvia Testa; Renato Miceli – Educational Measurement: Issues and Practice, 2025

Random Equating (RE) and Heuristic Approach (HA) are two linking procedures that may be used to compare the scores of individuals in two tests that measure the same latent trait, in conditions where there are no common items or individuals. In this study, RE--that may only be used when the individuals taking the two tests come from the same…

Descriptors: Comparative Testing, Heuristics, Problem Solving, Personality Traits

Digital Module 34: Introduction to Multilevel Measurement Modeling

Peer reviewed

Shaw, Mairead; Flake, Jessica K. – Educational Measurement: Issues and Practice, 2023

Clustered data structures are common in many areas of educational and psychological research (e.g., students clustered in schools, patients clustered by clinician). In the course of conducting research, questions are often administered to obtain scores reflecting latent constructs. Multilevel measurement models (MLMMs) allow for modeling…

Descriptors: Hierarchical Linear Modeling, Research Methodology, Data Analysis, Structural Equation Models

What Mathematics Content Do Teachers Teach? Optimizing Measurement of Opportunities to Learn in the Classroom

Peer reviewed

Jiahui Zhang; William H. Schmidt – Educational Measurement: Issues and Practice, 2024

Measuring opportunities to learn (OTL) is crucial for evaluating education quality and equity, but obtaining accurate and comprehensive OTL data at a large scale remains challenging. We attempt to address this issue by investigating measurement concerns in data collection and sampling. With the primary goal of estimating group-level OTLs for large…

Descriptors: Educational Opportunities, Measurement Techniques, Data Collection, Grade 4

Communicating Measurement Outcomes with (Better) Graphics

Peer reviewed

Setzer, J. Carl; Cui, Zhongmin – Educational Measurement: Issues and Practice, 2022

Data visualization is a core tenet of communicating measurement research and outcomes. Measurement professionals utilize data visualization in various phases of research, including exploration and communication. However, data visualization has not received enough attention in the measurement field. While it is true that many measurement graphics…

Descriptors: Measures (Individuals), Outcome Measures, Visual Aids, Data Analysis

Deriving Decisions from Disrupted Data

Peer reviewed

Sireci, Stephen G.; Suarez-Alvarez, Javier – Educational Measurement: Issues and Practice, 2022

The COVID-19 pandemic negatively affected the quality of data from educational testing programs. These data were previously used for many important purposes ranging from placing students in instructional programs to school accountability. In this article, we draw from the research design literature to point out the limitations inherent in…

Descriptors: Decision Making, Data Use, COVID-19, Pandemics

Applications and Modeling of Keystroke Logs in Writing Assessments

Peer reviewed

Mo Zhang; Paul Deane; Andrew Hoang; Hongwen Guo; Chen Li – Educational Measurement: Issues and Practice, 2025

In this paper, we describe two empirical studies that demonstrate the application and modeling of keystroke logs in writing assessments. We illustrate two different approaches of modeling differences in writing processes: analysis of mean differences in handcrafted theory-driven features and use of large language models to identify stable personal…

Descriptors: Writing Tests, Computer Assisted Testing, Keyboarding (Data Entry), Writing Processes

Instruction-Tuned Large-Language Models for Quality Control in Automatic Item Generation: A Feasibility Study

Peer reviewed

Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025

Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large-language models, specifically Llama 3-8B, for…

Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation

The Role of Response Style Adjustments in Cross-Country Comparisons--A Case Study Using Data from the PISA 2015 Questionnaire

Peer reviewed

Ulitzsch, Esther; Lüdtke, Oliver; Robitzsch, Alexander – Educational Measurement: Issues and Practice, 2023

Country differences in response styles (RS) may jeopardize cross-country comparability of Likert-type scales. When adjusting for rather than investigating RS is the primary goal, it seems advantageous to impose minimal assumptions on RS structures and leverage information from multiple scales for RS measurement. Using PISA 2015 background…

Descriptors: Response Style (Tests), Comparative Analysis, Achievement Tests, Foreign Countries

Disrupted Data: Using Longitudinal Assessment Systems to Monitor Test Score Quality

Peer reviewed

An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022

Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…

Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies

How Did Students Engage with a Remote Educational Assessment? A Case Study

Peer reviewed

Guo, Hongwen – Educational Measurement: Issues and Practice, 2022

Many educational summative and formative assessments have been transferred to a remote online setting because of the pandemic. Educational professionals and stakeholders have shown interest in learning how this change in the test mode influenced test takers; that is, whether test-taking experiences in a remote test setting were different from…

Descriptors: Distance Education, Educational Assessment, Student Evaluation, Summative Evaluation

The University of California Was Wrong to Abolish the SAT: Admissions When Affirmative Action Was Banned

Peer reviewed

Donald Wittman – Educational Measurement: Issues and Practice, 2024

I study student characteristics and academic performance at the University of California, where consideration of an applicant's ethnicity has been banned since 1996 and SAT scores were used in admitting students to the university until fall 2021. I show the following: (1) SAT scores were more important than high school grades in predicting…

Descriptors: College Entrance Examinations, Admission Criteria, Grade Point Average, Disproportionate Representation
