Showing 1 to 15 of 17 results
Huan Liu – ProQuest LLC, 2024
In many large-scale testing programs, examinees are frequently categorized into different performance levels. These classifications are then used to make high-stakes decisions about examinees in contexts such as licensure, certification, and educational assessments. Numerous approaches to estimating the consistency and accuracy of this…
Descriptors: Classification, Accuracy, Item Response Theory, Decision Making
Peer reviewed
Zhu, Hongyue; Jiao, Hong; Gao, Wei; Meng, Xiangbin – Journal of Educational and Behavioral Statistics, 2023
Change-point analysis (CPA) is a method for detecting abrupt changes in parameter(s) underlying a sequence of random variables. It has been applied to detect examinees' aberrant test-taking behavior by identifying abrupt test performance change. Previous studies utilized maximum likelihood estimations of ability parameters, focusing on detecting…
Descriptors: Bayesian Statistics, Test Wiseness, Behavior Problems, Reaction Time
Peer reviewed
Feinberg, Richard A. – Educational Measurement: Issues and Practice, 2021
Unforeseen complications during the administration of large-scale testing programs are inevitable and can prevent examinees from accessing all test material. For classification tests in which the primary purpose is to yield a decision, such as a pass/fail result, the current study investigated a model-based standard error approach, Bayesian…
Descriptors: High Stakes Tests, Classification, Decision Making, Bayesian Statistics
Peer reviewed
Niessen, A. Susan M.; Meijer, Rob R.; Tendeiro, Jorge N. – Educational Measurement: Issues and Practice, 2019
A longstanding concern about admissions to higher education is the underprediction of female academic performance by admission test scores. One explanation for these findings is selection system bias, that is, not all relevant KSAOs that are related to academic performance and gender are included in the prediction model. One solution to this…
Descriptors: College Admission, High Stakes Tests, Gender Differences, Sampling
Jing Lu; Chun Wang; Jiwei Zhang; Xue Wang – Grantee Submission, 2023
Changepoints are abrupt variations in a sequence of data in statistical inference. In educational and psychological assessments, it is pivotal to properly differentiate examinees' aberrant behaviors from solution behavior to ensure test reliability and validity. In this paper, we propose a sequential Bayesian changepoint detection algorithm to…
Descriptors: Bayesian Statistics, Behavior Patterns, Computer Assisted Testing, Accuracy
Jing Lu; Chun Wang; Ningzhong Shi – Grantee Submission, 2023
In high-stakes, large-scale, standardized tests with certain time limits, examinees are likely to engage in one of three types of behavior (e.g., van der Linden & Guo, 2008; Wang & Xu, 2015): solution behavior, rapid guessing behavior, and cheating behavior. Oftentimes examinees do not solve all items due to various…
Descriptors: High Stakes Tests, Standardized Tests, Guessing (Tests), Cheating
Peer reviewed
Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018
Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…
Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating
Peer reviewed
Huang, Hung-Yu – Educational and Psychological Measurement, 2017
Mixture item response theory (IRT) models have been suggested as an efficient method of detecting the different response patterns derived from latent classes when developing a test. In testing situations, multiple latent traits measured by a battery of tests can exhibit a higher-order structure, and mixtures of latent classes may occur on…
Descriptors: Item Response Theory, Models, Bayesian Statistics, Computation
Peer reviewed
Ng, Hui Leng; Koretz, Daniel – Applied Measurement in Education, 2015
Policymakers usually leave decisions about scaling the scores used for accountability to their appointed technical advisory committees and the testing contractors. However, scaling decisions can have an appreciable impact on school ratings. Using middle-school data from New York State, we examined the consistency of school ratings based on two…
Descriptors: School Effectiveness, Scaling, Middle Schools, Accountability
Peer reviewed
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Peer reviewed
Petscher, Yaacov; Kershaw, Sarah; Koon, Sharon; Foorman, Barbara R. – Regional Educational Laboratory Southeast, 2014
Districts and schools use progress monitoring to assess student progress, to identify students who fail to respond to intervention, and to further adapt instruction to student needs. Researchers and practitioners often use progress monitoring data to estimate student achievement growth (slope) and evaluate changes in performance over time for…
Descriptors: Reading Comprehension, Reading Achievement, Elementary School Students, Secondary School Students
Peer reviewed
Petscher, Yaacov; Kershaw, Sarah; Koon, Sharon; Foorman, Barbara R. – Regional Educational Laboratory Southeast, 2014
Districts and schools use progress monitoring to assess student progress, to identify students who fail to respond to intervention, and to further adapt instruction to student needs. Researchers and practitioners often use progress monitoring data to estimate student achievement growth (slope) and evaluate changes in performance over time for…
Descriptors: Response to Intervention, Achievement Gains, High Stakes Tests, Prediction
Peer reviewed
Hughes, Jan N.; Chen, Qi; Thoemmes, Felix; Kwok, Oi-man – Educational Evaluation and Policy Analysis, 2010
The association between grade retention in first grade and passing the third grade state accountability tests, the Texas Assessment of Knowledge and Skills (TAKS) reading and math, was investigated in a sample of 769 students who were recruited into the study when they were in first grade. Of these 769 students, 165 were retained in first grade…
Descriptors: Grade Repetition, Mathematics Tests, High Stakes Tests, Grade 3
Peer reviewed
Johnson, David R.; Thurlow, Martha L.; Stout, Karen Evans; Mavis, Ann – Journal of Special Education Leadership, 2007
In response to public demands for better-quality high school graduates and to requirements of No Child Left Behind legislation, states have developed a variety of policies such as high-stakes exit exams and diploma options. Additionally, under the Individuals with Disabilities Education Act Amendments of 1997, students with disabilities must be…
Descriptors: Federal Legislation, High Stakes Tests, High School Graduates, Exit Examinations
Williamson, David M.; Johnson, Matthew S.; Sinharay, Sandip; Bejar, Isaac I. – 2002
This paper explores the application of a technique for hierarchical item response theory (IRT) calibration of complex constructed response tasks that has promise both as a calibration tool and as a means of evaluating the isomorphic equivalence of complex constructed response tasks. Isomorphic tasks are explicitly and rigorously designed to be…
Descriptors: Bayesian Statistics, Constructed Response, Estimation (Mathematics), Evaluation Methods