Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 5 |
| Since 2017 (last 10 years) | 14 |
| Since 2007 (last 20 years) | 26 |
Descriptor
| Data Analysis | 12 |
| Data Collection | 8 |
| Evaluation Methods | 8 |
| Scores | 8 |
| Item Response Theory | 6 |
| Test Construction | 6 |
| Scoring | 5 |
| Simulation | 5 |
| Error of Measurement | 4 |
| Problem Solving | 4 |
| Test Bias | 4 |
Source
| Applied Measurement in… | 26 |
Author
| Bostic, Jonathan David | 2 |
| Cline, Frederick | 2 |
| Finch, Holmes | 2 |
| Allen, Jeff | 1 |
| Andersen, Øistein E. | 1 |
| Benjamin Lugu | 1 |
| Brian F. French | 1 |
| Bridgeman, Brent | 1 |
| Briscoe, Ted | 1 |
| Burton, Nancy | 1 |
| Béguin, Anton A. | 1 |
Publication Type
| Journal Articles | 26 |
| Reports - Research | 18 |
| Reports - Evaluative | 5 |
| Reports - Descriptive | 3 |
| Tests/Questionnaires | 2 |
| Information Analyses | 1 |
| Opinion Papers | 1 |
Education Level
| Secondary Education | 3 |
| Elementary Education | 2 |
| Grade 11 | 2 |
| High Schools | 2 |
| Grade 10 | 1 |
| Grade 4 | 1 |
| Grade 9 | 1 |
| Higher Education | 1 |
| Junior High Schools | 1 |
| Middle Schools | 1 |
| Postsecondary Education | 1 |
Audience
| Researchers | 1 |
Location
| Europe | 1 |
| Netherlands | 1 |
Assessments and Surveys
| Graduate Record Examinations | 1 |
Han, Yuting; Wilson, Mark – Applied Measurement in Education, 2022
A technology-based problem-solving test can automatically capture all the actions of students when they complete tasks and save them as process data. Response sequences are the external manifestations of the latent intellectual activities of the students, and they contain rich information about students' abilities and different problem-solving…
Descriptors: Technology Uses in Education, Problem Solving, 21st Century Skills, Evaluation Methods
Stefanie A. Wind; Benjamin Lugu – Applied Measurement in Education, 2024
Researchers who use measurement models for evaluation purposes often select models with stringent requirements, such as Rasch models, which are parametric. Mokken Scale Analysis (MSA) offers a theory-driven nonparametric modeling approach that may be more appropriate for some measurement applications. Researchers have discussed using MSA as a…
Descriptors: Item Response Theory, Data Analysis, Simulation, Nonparametric Statistics
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Sinharay, Sandip; Zhang, Mo; Deane, Paul – Applied Measurement in Education, 2019
Analysis of keystroke logging data is of increasing interest, as evident from a substantial amount of recent research on the topic. Some of the research on keystroke logging data has focused on the prediction of essay scores from keystroke logging features, but linear regression is the only prediction method that has been used in this research.…
Descriptors: Scores, Prediction, Writing Processes, Data Analysis
Leighton, Jacqueline P. – Applied Measurement in Education, 2021
The objective of this paper is to comment on the think-aloud methods presented in the three papers included in this special issue. The commentary offered stems from the author's own psychological investigations of unobservable information processes and the conditions under which the most defensible claims can be advanced. The structure of this…
Descriptors: Protocol Analysis, Data Collection, Test Construction, Test Validity
van Alphen, Thijmen; Jak, Suzanne; Jansen in de Wal, Joost; Schuitema, Jaap; Peetsma, Thea – Applied Measurement in Education, 2022
Intensive longitudinal data is increasingly used to study state-like processes such as changes in daily stress. Measures aimed at collecting such data require the same level of scrutiny regarding scale reliability as traditional questionnaires. The most prevalent methods used to assess reliability of intensive longitudinal measures are based on…
Descriptors: Test Reliability, Measures (Individuals), Anxiety, Data Collection
Bostic, Jonathan David – Applied Measurement in Education, 2021
Think alouds are valuable tools for academicians, test developers, and practitioners as they provide a unique window into a respondent's thinking during an assessment. The purpose of this special issue is to highlight novel ways to use think alouds as a means to gather evidence about respondents' thinking. An intended outcome from this special…
Descriptors: Protocol Analysis, Cognitive Processes, Data Collection, STEM Education
Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data
Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024
Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…
Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests
Bostic, Jonathan David; Sondergeld, Toni A.; Matney, Gabriel; Stone, Gregory; Hicks, Tiara – Applied Measurement in Education, 2021
Response process validity evidence provides a window into a respondent's cognitive processing. The purpose of this study is to describe a new data collection tool called a whole-class think aloud (WCTA). This work is performed as part of test development for a series of problem-solving measures to be used in elementary and middle grades. Data from…
Descriptors: Data Collection, Protocol Analysis, Problem Solving, Cognitive Processes
Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines
Pan, Tianshu; Yin, Yue – Applied Measurement in Education, 2017
In this article, we propose using the Bayes factors (BF) to evaluate person fit in item response theory models under the framework of Bayesian evaluation of an informative diagnostic hypothesis. We first discuss the theoretical foundation for this application and how to analyze person fit using BF. To demonstrate the feasibility of this approach,…
Descriptors: Bayesian Statistics, Goodness of Fit, Item Response Theory, Monte Carlo Methods
Yannakoudakis, Helen; Andersen, Øistein E.; Geranpayeh, Ardeshir; Briscoe, Ted; Nicholls, Diane – Applied Measurement in Education, 2018
There are quite a few challenges in the development of an automated writing placement model for non-native English learners, among them the fact that exams that encompass the full range of language proficiency exhibited at different stages of learning are hard to design. However, acquisition of appropriate training data that are relevant to the…
Descriptors: Automation, Data Processing, Student Placement, English Language Learners
DeMars, Christine – Applied Measurement in Education, 2015
In generalizability theory studies in large-scale testing contexts, sometimes a facet is very sparsely crossed with the object of measurement. For example, when assessments are scored by human raters, it may not be practical to have every rater score all students. Sometimes the scoring is systematically designed such that the raters are…
Descriptors: Educational Assessment, Measurement, Data, Generalizability Theory
Lee, Guemin; Lee, Won-Chan – Applied Measurement in Education, 2016
The main purposes of this study were to develop bi-factor multidimensional item response theory (BF-MIRT) observed-score equating procedures for mixed-format tests and to investigate relative appropriateness of the proposed procedures. Using data from a large-scale testing program, three types of pseudo data sets were formulated: matched samples,…
Descriptors: Test Format, Multidimensional Scaling, Item Response Theory, Equated Scores
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
