Showing 1 to 15 of 16 results
Peer reviewed
Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…
Descriptors: Scores, Test Validity, Test Items, Prior Learning
Peer reviewed
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Peer reviewed
Stanley, Leanne M.; Edwards, Michael C. – Educational and Psychological Measurement, 2016
The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the…
Descriptors: Test Reliability, Goodness of Fit, Scores, Patients
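To make the distinction Stanley and Edwards draw concrete, here is a minimal sketch (not taken from the article) of coefficient alpha, a common estimate of score reliability computed directly from item scores. A high alpha says nothing about whether a unidimensional measurement model fits the data, which is the separate question the article emphasizes. The data below are hypothetical.

```python
# Minimal sketch: coefficient alpha as a score-reliability estimate.
# A high alpha does not imply that a measurement model fits the data.
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """item_scores: examinees x items matrix of item scores."""
    k = item_scores.shape[1]                          # number of items
    item_vars = item_scores.var(axis=0, ddof=1)       # variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))                    # common ability
scores = (latent + rng.normal(size=(500, 20)) > 0).astype(int)  # hypothetical 0/1 items
print(round(cronbach_alpha(scores), 3))
```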
Peer reviewed
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
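As an illustration of the general idea only (not Longford's actual procedure), a classification that weighs the two kinds of misclassification can be written as an expected-loss comparison. The probability and loss values below are hypothetical.

```python
# Illustrative sketch, not Longford's method: classify an item by minimizing
# expected loss, given a declared acceptable DIF level and losses for the
# two kinds of inappropriate classification.
def classify_item(p_exceeds: float, loss_false_keep: float, loss_false_flag: float) -> str:
    """p_exceeds: probability that the item's DIF exceeds the acceptable level.
    loss_false_keep: loss of retaining an item whose DIF is unacceptable.
    loss_false_flag: loss of flagging an item whose DIF is acceptable."""
    expected_loss_keep = p_exceeds * loss_false_keep
    expected_loss_flag = (1 - p_exceeds) * loss_false_flag
    return "flag" if expected_loss_flag < expected_loss_keep else "keep"

print(classify_item(p_exceeds=0.30, loss_false_keep=5.0, loss_false_flag=1.0))  # -> flag
```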
Peer reviewed
Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…
Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores
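For readers unfamiliar with the methods being compared, here is a minimal sketch of IRT true-score equating under the 3PL model, assuming item parameters have already been linked to a common scale (the parameter values below are hypothetical, and the 1.7 scaling constant is a convention). The mapping inverts the new form's test characteristic curve and evaluates the reference form's curve at that ability.

```python
# Sketch of IRT true-score equating (TSE) with already-linked 3PL parameters.
import numpy as np
from scipy.optimize import brentq

def p3pl(theta, a, b, c):
    """3PL item response function."""
    return c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))

def true_score(theta, params):
    """Test characteristic curve: expected number-correct score at theta."""
    return sum(p3pl(theta, a, b, c) for a, b, c in params)

def tse_equate(score_x, params_x, params_y):
    """Find theta whose form-X true score equals score_x; return form-Y true score."""
    theta = brentq(lambda t: true_score(t, params_x) - score_x, -6, 6)
    return true_score(theta, params_y)

# Hypothetical linked item parameters (a, b, c) for two short forms
form_x = [(1.0, -0.5, 0.2), (1.2, 0.0, 0.2), (0.8, 0.8, 0.2)]
form_y = [(0.9, -0.3, 0.2), (1.1, 0.2, 0.2), (1.0, 0.5, 0.2)]
print(round(tse_equate(2.0, form_x, form_y), 3))
```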
Peer reviewed
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
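The report itself documents the operational ETS rules; as a hedged illustration, the widely cited A/B/C item classification is based on the Mantel-Haenszel D-DIF statistic, MH D-DIF = -2.35 ln(alpha_MH). The sketch below computes alpha_MH from 2x2 tables at each matched score level and applies a size-only version of the rule (the operational rules also involve significance tests, omitted here); the counts are hypothetical.

```python
# Hedged sketch of the Mantel-Haenszel D-DIF statistic behind the ETS
# A/B/C classification (significance tests omitted).
import math

def mh_d_dif(tables):
    """tables: per score level, (A, B, C, D) = reference correct/incorrect,
    focal correct/incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    alpha_mh = num / den                   # common odds ratio estimate
    return -2.35 * math.log(alpha_mh)      # delta-metric DIF statistic

def ets_category(d):
    """Simplified, size-only version of the A/B/C rule."""
    if abs(d) < 1.0:
        return "A (negligible)"
    if abs(d) >= 1.5:
        return "C (large)"
    return "B (moderate)"

tables = [(40, 10, 30, 20), (35, 15, 25, 25)]   # hypothetical counts by score level
d = mh_d_dif(tables)
print(round(d, 2), ets_category(d))
```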
Peer reviewed
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
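A central point in this literature is that the usual OLS assumptions concern the errors, not the raw outcome, so diagnostics are applied to residuals. The sketch below is not from the article; it uses hypothetical data and two standard residual checks (Shapiro-Wilk for normality, Breusch-Pagan for constant error variance).

```python
# Sketch of residual-based regression diagnostics on hypothetical data.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=1.0, size=200)

model = sm.OLS(y, sm.add_constant(X)).fit()
resid = model.resid

# Normality of the residuals (not of y itself)
w_stat, w_p = stats.shapiro(resid)
print("Shapiro-Wilk p =", round(w_p, 3))

# Constant error variance
lm_stat, lm_p, f_stat, f_p = het_breuschpagan(resid, model.model.exog)
print("Breusch-Pagan p =", round(lm_p, 3))
```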
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias
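As a hedged illustration of the idea behind the Rothstein falsification test (one common rendering, not the authors' or Rothstein's exact specification): regress a prior year's achievement gain on indicators for students' future teacher assignments and test the indicators jointly; future teachers "predicting" past gains suggests nonrandom sorting. The data and teacher labels below are hypothetical.

```python
# Illustrative sketch of a falsification-style check on hypothetical data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 600
df = pd.DataFrame({
    "prior_gain": rng.normal(size=n),                     # gain before assignment
    "future_teacher": rng.choice(list("ABCDE"), size=n),  # teacher assigned later
})

fit = smf.ols("prior_gain ~ C(future_teacher)", data=df).fit()
# With the teacher indicators as the only regressors, the model's overall
# F-test is a joint test of whether future teachers "predict" the prior gain.
print("joint F p-value:", round(fit.f_pvalue, 3))
```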
Peer reviewed
Beuchert, A. Kent; Mendoza, Jorge L. – Journal of Educational Measurement, 1979
Ten item discrimination indices were compared across a variety of item analysis situations, based on the validities of tests constructed by using each index to select 40 items from a 100-item pool. Item score data were generated by a computer program and included a simulation of guessing. (Author/CTM)
Descriptors: Item Analysis, Simulation, Statistical Analysis, Test Construction
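For context, here is a minimal sketch (not the authors' code) of two classical discrimination indices of the kind compared in such studies, the point-biserial correlation and the upper-lower (D) index, applied to a simulated 0/1 item-score matrix and used to keep the most discriminating items.

```python
# Sketch of two classical item discrimination indices on simulated data.
import numpy as np

def point_biserial(item, total):
    """Correlation between a 0/1 item score and the total test score."""
    return np.corrcoef(item, total)[0, 1]

def upper_lower_d(item, total, frac=0.27):
    """Proportion correct in the top group minus the bottom group (by total score)."""
    n = int(len(total) * frac)
    order = np.argsort(total)
    return item[order[-n:]].mean() - item[order[:n]].mean()

rng = np.random.default_rng(3)
items = rng.integers(0, 2, size=(1000, 100))       # hypothetical 100-item pool
totals = items.sum(axis=1)
pbis = np.array([point_biserial(items[:, j], totals) for j in range(items.shape[1])])
best_40 = np.argsort(pbis)[-40:]                   # e.g., keep the 40 most discriminating
print(len(best_40), round(pbis.max(), 3))
```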
Reckase, Mark D. – 1978
Five comparisons were made relative to the quality of estimates of ability parameters and item calibrations obtained from the one-parameter and three-parameter logistic models. The results indicate: (1) The three-parameter model fit the test data better in all cases than did the one-parameter model. For simulation data sets, multi-factor data were…
Descriptors: Comparative Analysis, Goodness of Fit, Item Analysis, Mathematical Models
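The two models being compared are standard; for reference, here is a short sketch of their item response functions (the 1.7 scaling constant in the 3PL form is a convention, and the parameter values are hypothetical).

```python
# Sketch of the one-parameter (Rasch) and three-parameter logistic models.
import numpy as np

def p_1pl(theta, b):
    """One-parameter logistic: probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def p_3pl(theta, a, b, c):
    """Three-parameter logistic: adds discrimination a and lower asymptote c."""
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

theta = np.linspace(-3, 3, 7)
print(np.round(p_1pl(theta, b=0.0), 2))
print(np.round(p_3pl(theta, a=1.2, b=0.0, c=0.2), 2))
```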
Pine, Steven M.; Weiss, David J. – 1978
This report examines how selection fairness is influenced by the characteristics of a selection instrument in terms of its distribution of item difficulties, level of item discrimination, degree of item bias, and testing strategy. Computer simulation was used to administer either a conventional or a Bayesian adaptive ability test to a…
Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Testing, Computer Assisted Testing
Westers, Paul – 1993
This dissertation examines differential item functioning (DIF) through the use of loglinear Rasch models with latent classes. DIF refers to the situation in which the probability of a correct response differs across racial, ethnic, or gender groups for equally able test takers. Because the usual methods of detecting DIF give little…
Descriptors: Ability, Estimation (Mathematics), Ethnic Groups, Foreign Countries
Merz, William R.; Grossen, Neal E. – 1978
Six approaches to assessing test item bias were examined: transformed item difficulty, point-biserial correlations, chi-square, factor analysis, the one-parameter item characteristic curve, and the three-parameter item characteristic curve. Data sets for analysis were generated by a Monte Carlo technique based on the three-parameter model; thus, four…
Descriptors: Difficulty Level, Evaluation Methods, Factor Analysis, Item Analysis
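The abstract mentions Monte Carlo data generation under the three-parameter model; a minimal sketch of that kind of simulation is below (all parameter values are hypothetical).

```python
# Sketch of Monte Carlo generation of 0/1 responses under the 3PL model.
import numpy as np

rng = np.random.default_rng(4)
n_examinees, n_items = 2000, 40
theta = rng.normal(size=n_examinees)                     # abilities
a = rng.lognormal(mean=0.0, sigma=0.3, size=n_items)     # discriminations
b = rng.normal(size=n_items)                             # difficulties
c = rng.uniform(0.1, 0.25, size=n_items)                 # lower asymptotes (guessing)

p = c + (1 - c) / (1 + np.exp(-1.7 * a * (theta[:, None] - b)))  # 3PL probabilities
responses = (rng.uniform(size=p.shape) < p).astype(int)
print(responses.shape, round(responses.mean(), 2))
```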
Pine, Steven M.; Weiss, David J. – 1976
This report examines how selection fairness is influenced by the item characteristics of a selection instrument in terms of its distribution of item difficulties, level of item discrimination, and degree of item bias. Computer simulation was used to administer conventional ability tests to a hypothetical target population consisting of…
Descriptors: Aptitude Tests, Bias, Computer Programs, Culture Fair Tests
Civil Service Commission, Washington, DC. Personnel Research and Development Center. – 1976
This pamphlet reprints three papers and an invited discussion of them, read at a Division 5 Symposium at the 1975 American Psychological Association Convention. The first paper describes a Bayesian tailored testing process and shows how it demonstrates the importance of using test items with high discrimination, low guessing probability, and a…
Descriptors: Adaptive Testing, Bayesian Statistics, Computer Oriented Programs, Computer Programs
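As an illustrative sketch only (not the pamphlet's procedure), one step of Bayesian tailored testing can be written as: update a discrete posterior over ability after each response, then administer the unused item with the most Fisher information at the current ability estimate. The 2PL model and item parameters below are assumptions for the example.

```python
# Illustrative sketch of one step of Bayesian tailored (adaptive) testing
# under a 2PL model: posterior update followed by maximum-information selection.
import numpy as np

def p2pl(theta, a, b):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def posterior_update(prior, grid, a, b, response):
    """Multiply the prior by the likelihood of the observed 0/1 response."""
    p = p2pl(grid, a, b)
    like = p if response == 1 else 1.0 - p
    post = prior * like
    return post / post.sum()

def next_item(theta_hat, items, used):
    """Pick the unused item with maximum Fisher information at theta_hat."""
    info = [a * a * p2pl(theta_hat, a, b) * (1 - p2pl(theta_hat, a, b))
            if j not in used else -np.inf
            for j, (a, b) in enumerate(items)]
    return int(np.argmax(info))

grid = np.linspace(-4, 4, 81)
post = np.exp(-0.5 * grid**2); post /= post.sum()       # standard normal prior
items = [(1.5, 0.0), (0.8, -1.0), (1.2, 1.0)]           # hypothetical (a, b) pairs

post = posterior_update(post, grid, *items[0], response=1)
theta_hat = float(np.dot(grid, post))                   # EAP ability estimate
print(round(theta_hat, 2), "next item:", next_item(theta_hat, items, used={0}))
```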