Showing all 10 results
Sinharay, Sandip; Johnson, Matthew S. – Journal of Educational and Behavioral Statistics, 2021
Score differencing is one of the six categories of statistical methods used to detect test fraud (Wollack & Schoenig, 2018) and involves testing the null hypothesis that an examinee's performance is similar over two item sets against the alternative hypothesis that performance is better on one of the item sets. We suggest, to…
Descriptors: Probability, Bayesian Statistics, Cheating, Statistical Analysis
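The abstract above is truncated by the results page, but the hypothesis-testing setup it describes is concrete enough to illustrate. Below is a minimal sketch of a Bayes factor for score differencing under a Rasch model with known item difficulties; the data, difficulties, prior, and the two-sided alternative are all illustrative assumptions, not the authors' actual procedure (the paper's alternative is directional: performance is better on one of the sets).

```python
import numpy as np
from scipy.stats import norm

# Illustrative data: one examinee's binary responses on two item sets,
# with Rasch difficulties treated as known (a simplification).
b1 = np.array([-0.5, 0.2, 0.8])   # item difficulties, set 1
b2 = np.array([-0.3, 0.5, 1.1])   # item difficulties, set 2
x1 = np.array([1, 1, 0])          # responses on set 1
x2 = np.array([1, 0, 0])          # responses on set 2

def rasch_lik(x, b, theta):
    """Likelihood of response vector x at ability theta under the Rasch model."""
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    return np.prod(np.where(x == 1, p, 1.0 - p))

# Marginal likelihoods by brute-force quadrature over a N(0, 1) ability prior.
grid = np.linspace(-6.0, 6.0, 2001)
dx = grid[1] - grid[0]
prior = norm.pdf(grid)

def marginal(lik):
    return np.sum(lik * prior) * dx

# H0: a single ability governs both item sets.
m0 = marginal(np.array([rasch_lik(x1, b1, t) * rasch_lik(x2, b2, t) for t in grid]))

# H1: independent abilities on the two sets (a two-sided simplification of
# the directional alternative described in the abstract).
m1 = (marginal(np.array([rasch_lik(x1, b1, t) for t in grid])) *
      marginal(np.array([rasch_lik(x2, b2, t) for t in grid])))

print("Bayes factor BF01 =", m0 / m1)  # values > 1 favor similar performance
```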
Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2021
Score differencing is one of six categories of statistical methods used to detect test fraud (Wollack & Schoenig, 2018) and involves testing the null hypothesis that an examinee's performance is similar over two item sets against the alternative hypothesis that performance is better on one of the item sets. We suggest, to…
Descriptors: Probability, Bayesian Statistics, Cheating, Statistical Analysis
Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2019
According to Wollack and Schoenig (2018), score differencing is one of six types of statistical methods used to detect test fraud. In this paper, we suggest the use of Bayes factors (e.g., Kass & Raftery, 1995) for score differencing. A simulation study shows that the suggested approach performs slightly better than an existing frequentist…
Descriptors: Cheating, Deception, Statistical Analysis, Bayesian Statistics
Peer reviewed
Johnson, Matthew S.; Sinharay, Sandip – Applied Psychological Measurement, 2005
For complex educational assessments, there is an increasing use of item families, which are groups of related items. Calibration or scoring in an assessment involving item families requires models that can take into account the dependence structure inherent among the items that belong to the same item family. This article extends earlier works in…
Descriptors: National Competency Tests, Markov Processes, Bayesian Statistics
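As a rough illustration of the dependence structure this article refers to, the sketch below simulates item difficulties hierarchically, with items in the same family scattering around a shared family mean; the normal-normal parameterization and all numbers are assumptions for illustration, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical hierarchy: each item family has a mean difficulty, and items
# within a family scatter around it. The shared family mean is what induces
# within-family dependence among items.
n_families, items_per_family = 4, 5
mu_family = rng.normal(0.0, 1.0, size=n_families)   # family-level means
sigma_within = 0.3                                  # within-family spread

# Item difficulties: b[f, i] ~ Normal(mu_family[f], sigma_within^2)
b = rng.normal(mu_family[:, None], sigma_within,
               size=(n_families, items_per_family))

# Items within a family are more alike than items across families:
print("average within-family SD:", round(b.std(axis=1).mean(), 2))
print("SD of family means      :", round(b.mean(axis=1).std(), 2))
```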
Peer reviewed
Sinharay, Sandip; Johnson, Matthew S.; Stern, Hal S. – Applied Psychological Measurement, 2006
Model checking in item response theory (IRT) is an underdeveloped area. There is no universally accepted tool for checking IRT models. The posterior predictive model-checking method is a popular Bayesian model-checking tool because it has intuitive appeal, is simple to apply, has a strong theoretical basis, and can provide graphical or numerical…
Descriptors: Predictive Measurement, Item Response Theory, Bayesian Statistics, Models
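A minimal runnable sketch of the posterior predictive model-checking idea the abstract describes, assuming a Rasch model and faking the posterior draws (in practice they would come from an MCMC fit); the discrepancy statistic used here, the standard deviation of total scores, is only one common choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake a Rasch data set and "posterior draws" so the check itself is
# self-contained and runnable.
n_persons, n_items, n_draws = 200, 10, 500
b_true = rng.normal(0.0, 1.0, n_items)
theta_true = rng.normal(0.0, 1.0, n_persons)
p_obs = 1.0 / (1.0 + np.exp(-(theta_true[:, None] - b_true[None, :])))
y_obs = rng.binomial(1, p_obs)

theta_draws = theta_true + rng.normal(0.0, 0.1, (n_draws, n_persons))
b_draws = b_true + rng.normal(0.0, 0.1, (n_draws, n_items))

def discrepancy(y):
    """Test statistic: standard deviation of examinees' total scores."""
    return y.sum(axis=1).std()

# For each posterior draw, simulate a replicated data set and compare the
# statistic on replicated vs. observed data; the posterior predictive
# p-value is the share of replications at least as extreme as the data.
t_obs = discrepancy(y_obs)
extreme = 0
for d in range(n_draws):
    p_rep = 1.0 / (1.0 + np.exp(-(theta_draws[d][:, None] - b_draws[d][None, :])))
    extreme += discrepancy(rng.binomial(1, p_rep)) >= t_obs

print("posterior predictive p-value:", extreme / n_draws)  # near 0.5 suggests fit
```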
Peer reviewed
Sinharay, Sandip; Johnson, Matthew S.; Williamson, David M. – Journal of Educational and Behavioral Statistics, 2003
Item families, which are groups of related items, are becoming increasingly popular in complex educational assessments. For example, in automatic item generation (AIG) systems, a test may consist of multiple items generated from each of a number of item models. Item calibration or scoring for such an assessment requires fitting models that can…
Descriptors: Test Items, Markov Processes, Educational Testing, Probability
Johnson, Matthew S.; Sinharay, Sandip – 2003
For complex educational assessments, there is an increasing use of "item families," which are groups of related items. However, calibration or scoring for such an assessment requires fitting models that take into account the dependence structure inherent among the items that belong to the same item family. C. Glas and W. van der Linden…
Descriptors: Bayesian Statistics, Constructed Response, Educational Assessment, Estimation (Mathematics)
Williamson, David M.; Johnson, Matthew S.; Sinharay, Sandip; Bejar, Isaac I. – 2002
This paper explores the application of a technique for hierarchical item response theory (IRT) calibration of complex constructed response tasks that has promise both as a calibration tool and as a means of evaluating the isomorphic equivalence of complex constructed response tasks. Isomorphic tasks are explicitly and rigorously designed to be…
Descriptors: Bayesian Statistics, Constructed Response, Estimation (Mathematics), Evaluation Methods
Peer reviewed
Johnson, Matthew S.; Junker, Brian W. – Journal of Educational and Behavioral Statistics, 2003
Unfolding response models, a class of item response theory (IRT) models that assume a unimodal item response function (IRF), are often used for the measurement of attitudes. Verhelst and Verstralen (1993) and Andrich and Luo (1993) independently developed unfolding response models by relating the observed responses to a more common monotone IRT…
Descriptors: Markov Processes, Item Response Theory, Computation, Data Analysis
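For concreteness, one published unfolding IRF is the hyperbolic cosine model of Andrich and Luo (1993), cited in the abstract above, in which the probability of agreement peaks when the person location equals the item location; the parameter values below are illustrative.

```python
import numpy as np

def hcm_irf(theta, delta, gamma):
    """Hyperbolic cosine model (Andrich & Luo, 1993): agreement probability
    peaks when the person location theta equals the item location delta,
    giving the unimodal IRF that defines unfolding."""
    return np.exp(gamma) / (np.exp(gamma) + 2.0 * np.cosh(theta - delta))

theta = np.linspace(-4.0, 4.0, 9)
for t, p in zip(theta, hcm_irf(theta, delta=0.0, gamma=1.0)):
    print(f"theta = {t:+.1f}   P(agree) = {p:.3f}")  # rises, peaks at delta, falls
```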
Peer reviewed
Johnson, Matthew S.; Jenkins, Frank – ETS Research Report Series, 2005
Large-scale educational assessments such as the National Assessment of Educational Progress (NAEP) sample examinees to whom an exam will be administered. In most situations the sampling design is not a simple random sample and must be accounted for in the estimating model. After reviewing the current operational estimation procedure for NAEP, this…
Descriptors: Bayesian Statistics, Hierarchical Linear Modeling, National Competency Tests, Sampling
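A toy sketch of why the sampling design matters for estimation, as the report argues: under a hypothetical two-stratum design in which one stratum is oversampled, the unweighted mean is biased while the design-weighted mean recovers the population value. The strata, sizes, and weights are invented for illustration and are not NAEP's operational procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-stratum design: stratum B is oversampled relative to its
# population share, so the unweighted sample mean is biased toward B.
mu = {"A": 0.5, "B": -0.5}          # true stratum means (invented)
pop_share = {"A": 0.8, "B": 0.2}    # population proportions
n = {"A": 200, "B": 200}            # equal sample sizes: B oversampled
n_total = sum(n.values())

scores, weights = [], []
for s in ("A", "B"):
    scores.append(rng.normal(mu[s], 1.0, n[s]))
    # design weight = population share / sample share for the stratum
    weights.append(np.full(n[s], pop_share[s] / (n[s] / n_total)))
scores, weights = np.concatenate(scores), np.concatenate(weights)

print("population mean:", sum(pop_share[s] * mu[s] for s in ("A", "B")))  # 0.3
print("unweighted mean:", round(scores.mean(), 3))                        # near 0.0
print("weighted mean  :", round(np.average(scores, weights=weights), 3))  # near 0.3
```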