Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 5 |
| Since 2017 (last 10 years) | 14 |
Descriptor
| Data Collection | 6 |
| Scores | 6 |
| Data Analysis | 4 |
| Evaluation Methods | 4 |
| Test Construction | 4 |
| Item Response Theory | 3 |
| Protocol Analysis | 3 |
| Test Validity | 3 |
| Automation | 2 |
| Bayesian Statistics | 2 |
| Cognitive Processes | 2 |
| More ▼ | |
Source
| Applied Measurement in… | 14 |
Author
| Bostic, Jonathan David | 2 |
| Allen, Jeff | 1 |
| Andersen, Øistein E. | 1 |
| Benjamin Lugu | 1 |
| Brian F. French | 1 |
| Briscoe, Ted | 1 |
| Carol Eckerly | 1 |
| Dadey, Nathan | 1 |
| DePascale, Charles | 1 |
| Deane, Paul | 1 |
| Geranpayeh, Ardeshir | 1 |
| More ▼ | |
Publication Type
| Journal Articles | 14 |
| Reports - Research | 10 |
| Reports - Descriptive | 3 |
| Information Analyses | 1 |
| Opinion Papers | 1 |
| Reports - Evaluative | 1 |
| Tests/Questionnaires | 1 |
Education Level
| Secondary Education | 3 |
| Grade 11 | 2 |
| High Schools | 2 |
| Elementary Education | 1 |
| Grade 10 | 1 |
| Grade 9 | 1 |
| Junior High Schools | 1 |
| Middle Schools | 1 |
Audience
Location
| Europe | 1 |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Han, Yuting; Wilson, Mark – Applied Measurement in Education, 2022
A technology-based problem-solving test can automatically capture all the actions of students when they complete tasks and save them as process data. Response sequences are the external manifestations of the latent intellectual activities of the students, and it contains rich information about students' abilities and different problem-solving…
Descriptors: Technology Uses in Education, Problem Solving, 21st Century Skills, Evaluation Methods
Stefanie A. Wind; Benjamin Lugu – Applied Measurement in Education, 2024
Researchers who use measurement models for evaluation purposes often select models with stringent requirements, such as Rasch models, which are parametric. Mokken Scale Analysis (MSA) offers a theory-driven nonparametric modeling approach that may be more appropriate for some measurement applications. Researchers have discussed using MSA as a…
Descriptors: Item Response Theory, Data Analysis, Simulation, Nonparametric Statistics
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Sinharay, Sandip; Zhang, Mo; Deane, Paul – Applied Measurement in Education, 2019
Analysis of keystroke logging data is of increasing interest, as evident from a substantial amount of recent research on the topic. Some of the research on keystroke logging data has focused on the prediction of essay scores from keystroke logging features, but linear regression is the only prediction method that has been used in this research.…
Descriptors: Scores, Prediction, Writing Processes, Data Analysis
Leighton, Jacqueline P. – Applied Measurement in Education, 2021
The objective of this paper is to comment on the think-aloud methods presented in the three papers included in this special issue. The commentary offered stems from the author's own psychological investigations of unobservable information processes and the conditions under which the most defensible claims can be advanced. The structure of this…
Descriptors: Protocol Analysis, Data Collection, Test Construction, Test Validity
van Alphen, Thijmen; Jak, Suzanne; Jansen in de Wal, Joost; Schuitema, Jaap; Peetsma, Thea – Applied Measurement in Education, 2022
Intensive longitudinal data is increasingly used to study state-like processes such as changes in daily stress. Measures aimed at collecting such data require the same level of scrutiny regarding scale reliability as traditional questionnaires. The most prevalent methods used to assess reliability of intensive longitudinal measures are based on…
Descriptors: Test Reliability, Measures (Individuals), Anxiety, Data Collection
Bostic, Jonathan David – Applied Measurement in Education, 2021
Think alouds are valuable tools for academicians, test developers, and practitioners as they provide a unique window into a respondent's thinking during an assessment. The purpose of this special issue is to highlight novel ways to use think alouds as a means to gather evidence about respondents' thinking. An intended outcome from this special…
Descriptors: Protocol Analysis, Cognitive Processes, Data Collection, STEM Education
Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data
Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024
Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…
Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests
Bostic, Jonathan David; Sondergeld, Toni A.; Matney, Gabriel; Stone, Gregory; Hicks, Tiara – Applied Measurement in Education, 2021
Response process validity evidence provides a window into a respondent's cognitive processing. The purpose of this study is to describe a new data collection tool called a whole-class think aloud (WCTA). This work is performed as part of test development for a series of problem-solving measures to be used in elementary and middle grades. Data from…
Descriptors: Data Collection, Protocol Analysis, Problem Solving, Cognitive Processes
Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines
Pan, Tianshu; Yin, Yue – Applied Measurement in Education, 2017
In this article, we propose using the Bayes factors (BF) to evaluate person fit in item response theory models under the framework of Bayesian evaluation of an informative diagnostic hypothesis. We first discuss the theoretical foundation for this application and how to analyze person fit using BF. To demonstrate the feasibility of this approach,…
Descriptors: Bayesian Statistics, Goodness of Fit, Item Response Theory, Monte Carlo Methods
Yannakoudakis, Helen; Andersen, Øistein E.; Geranpayeh, Ardeshir; Briscoe, Ted; Nicholls, Diane – Applied Measurement in Education, 2018
There are quite a few challenges in the development of an automated writing placement model for non-native English learners, among them the fact that exams that encompass the full range of language proficiency exhibited at different stages of learning are hard to design. However, acquisition of appropriate training data that are relevant to the…
Descriptors: Automation, Data Processing, Student Placement, English Language Learners
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Allen, Jeff – Applied Measurement in Education, 2017
Using a sample of schools testing annually in grades 9-11 with a vertically linked series of assessments, a latent growth curve model is used to model test scores with student intercepts and slopes nested within school. Missed assessments can occur because of student mobility, student dropout, absenteeism, and other reasons. Missing data…
Descriptors: Achievement Gains, Academic Achievement, Growth Models, Scores

Peer reviewed
Direct link
