NotesFAQContact Us
Collection
Advanced
Search Tips
Showing 1 to 15 of 18 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Stella Y. Kim; Sungyeun Kim – Educational Measurement: Issues and Practice, 2025
This study presents several multivariate Generalizability theory designs for analyzing automatic item-generated (AIG) based test forms. The study used real data to illustrate the analysis procedure and discuss practical considerations. We collected the data from two groups of students, each group receiving a different form generated by AIG. A…
Descriptors: Generalizability Theory, Automation, Test Items, Students
Peer reviewed Peer reviewed
Direct linkDirect link
Silvia Testa; Renato Miceli; Renato Miceli – Educational Measurement: Issues and Practice, 2025
Random Equating (RE) and Heuristic Approach (HA) are two linking procedures that may be used to compare the scores of individuals in two tests that measure the same latent trait, in conditions where there are no common items or individuals. In this study, RE--that may only be used when the individuals taking the two tests come from the same…
Descriptors: Comparative Testing, Heuristics, Problem Solving, Personality Traits
Peer reviewed Peer reviewed
Direct linkDirect link
Jiahui Zhang; William H. Schmidt – Educational Measurement: Issues and Practice, 2024
Measuring opportunities to learn (OTL) is crucial for evaluating education quality and equity, but obtaining accurate and comprehensive OTL data at a large scale remains challenging. We attempt to address this issue by investigating measurement concerns in data collection and sampling. With the primary goal of estimating group-level OTLs for large…
Descriptors: Educational Opportunities, Measurement Techniques, Data Collection, Grade 4
Peer reviewed Peer reviewed
Direct linkDirect link
Mo Zhang; Paul Deane; Andrew Hoang; Hongwen Guo; Chen Li – Educational Measurement: Issues and Practice, 2025
In this paper, we describe two empirical studies that demonstrate the application and modeling of keystroke logs in writing assessments. We illustrate two different approaches of modeling differences in writing processes: analysis of mean differences in handcrafted theory-driven features and use of large language models to identify stable personal…
Descriptors: Writing Tests, Computer Assisted Testing, Keyboarding (Data Entry), Writing Processes
Peer reviewed Peer reviewed
Direct linkDirect link
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality persists to be a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large-language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Peer reviewed Peer reviewed
Direct linkDirect link
Voss, Nathaniel M.; Vangsness, Lisa – Educational Measurement: Issues and Practice, 2020
While it is easy to assume that university students who wait until the last minute to complete surveys for their class research requirements provide low-quality data, this issue has not been empirically examined. The goal of the present study was to examine the relation between student research procrastination and two important data quality…
Descriptors: Time Management, College Students, Data Collection, Student Surveys
Peer reviewed Peer reviewed
Direct linkDirect link
Ulitzsch, Esther; Lüdtke, Oliver; Robitzsch, Alexander – Educational Measurement: Issues and Practice, 2023
Country differences in response styles (RS) may jeopardize cross-country comparability of Likert-type scales. When adjusting for rather than investigating RS is the primary goal, it seems advantageous to impose minimal assumptions on RS structures and leverage information from multiple scales for RS measurement. Using PISA 2015 background…
Descriptors: Response Style (Tests), Comparative Analysis, Achievement Tests, Foreign Countries
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2021
Technical difficulties occasionally lead to missing item scores and hence to incomplete data on computerized tests. It is not straightforward to report scores to the examinees whose data are incomplete due to technical difficulties. Such reporting essentially involves imputation of missing scores. In this paper, a simulation study based on data…
Descriptors: Data Analysis, Scores, Educational Assessment, Educational Testing
Peer reviewed Peer reviewed
Direct linkDirect link
Baron, Patricia; Sireci, Stephen G.; Slater, Sharon C. – Educational Measurement: Issues and Practice, 2021
Since the No Child Left Behind Act (No Child Left Behind [NCLB], 2001) was enacted, the Bookmark method has been used in many state standard setting studies (Karantonis and Sireci; Zieky, Perie, and Livingston). The purpose of the current study is to evaluate the criticism that when panelists are presented with data during the Bookmark standard…
Descriptors: State Standards, Standard Setting, Evaluators, Training
Peer reviewed Peer reviewed
Direct linkDirect link
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Peer reviewed Peer reviewed
Direct linkDirect link
Guo, Hongwen – Educational Measurement: Issues and Practice, 2022
Many educational summative and formative assessments have been transferred to a remote online setting because of the pandemic. Educational professionals and stakeholders have shown interest in learning how this change in the test mode influenced test takers; that is, whether test-taking experiences in a remote test setting were different from…
Descriptors: Distance Education, Educational Assessment, Student Evaluation, Summative Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
Lewis, Charlie; Chajewski, Michael; Rupp, André A. – Educational Measurement: Issues and Practice, 2018
In this ITEMS module, we provide a two-part introduction to the topic of reliability from the perspective of "classical test theory" (CTT). In the first part, which is directed primarily at beginning learners, we review and build on the content presented in the original didactic ITEMS article by Traub and Rowley (1991). Specifically, we…
Descriptors: Test Reliability, Test Theory, Computation, Data Collection
Peer reviewed Peer reviewed
Direct linkDirect link
Polikoff, Morgan S.; Gasparian, Hovanes; Korn, Shira; Gamboa, Martin; Porter, Andrew C.; Smith, Toni; Garet, Michael S. – Educational Measurement: Issues and Practice, 2020
As the standards movement continues into its third decade, there remains a need for alignment methodologies that can be broadly applied to study instruction and policy. This article reports on a series of development efforts meant to revise the Surveys of Enacted Curriculum (SEC) surveys and methods to study the implementation of new college- and…
Descriptors: Alignment (Education), Surveys, College Readiness, Career Readiness
Peer reviewed Peer reviewed
Direct linkDirect link
Feinberg, Richard A.; Jurich, Daniel P. – Educational Measurement: Issues and Practice, 2017
Recent research has proposed a criterion to evaluate the reportability of subscores. This criterion is a value-added ratio ("VAR"), where values greater than 1 suggest that the true subscore is better approximated by the observed subscore than by the total score. This research extends the existing literature by quantifying statistical…
Descriptors: Guidelines, Scores, Research Reports, Value Added Models
Peer reviewed Peer reviewed
Direct linkDirect link
Zhang, Mo; Bennett, Randy E.; Deane, Paul; van Rijn, Peter W. – Educational Measurement: Issues and Practice, 2019
This study compared gender groups on the processes used in writing essays in an online assessment. Middle-school students from four grades responded to essays in two persuasive subgenres, argumentation and policy recommendation. Writing processes were inferred from four indicators extracted from students' keystroke logs. In comparison to males, on…
Descriptors: Gender Differences, Essays, Computer Assisted Testing, Persuasive Discourse
Previous Page | Next Page »
Pages: 1  |  2