Showing all 15 results
Thapa, Kritika – ProQuest LLC, 2023
Measurement invariance is crucial for making valid comparisons across different groups (Kline, 2016; Vandenberg, 2002). To address challenges associated with invariance testing, such as large sample size requirements and model complexity, applied researchers have incorporated parcels. Parcels have been shown to alleviate skewness,…
Descriptors: Elementary Secondary Education, Achievement Tests, Foreign Countries, International Assessment
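Parceling is typically implemented by averaging (or summing) subsets of items into composite indicators before fitting the measurement model. A minimal sketch in Python, assuming a hypothetical 12-item scale split into three 4-item parcels (the item-to-parcel assignment is illustrative, not the author's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical responses: 200 examinees x 12 Likert-type items (1-5).
items = rng.integers(1, 6, size=(200, 12))

# Assign items to three 4-item parcels (illustrative grouping only).
parcel_index = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# Each parcel score is the mean of its constituent items; the parcels
# then serve as indicators of the latent factor in place of single items.
parcels = np.column_stack([items[:, idx].mean(axis=1) for idx in parcel_index])

print(parcels.shape)  # (200, 3)
```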
Peer reviewed
Ockey, Gary J.; Wagner, Elvis – Language Learning & Language Teaching, 2018
This book is relevant for language testers, listening researchers, and oral proficiency teachers, in that it explores four broad themes related to the assessment of L2 listening ability: the use of authentic, real-world spoken texts; the effects of different speech varieties of listening inputs; the use of audio-visual texts; and assessing…
Descriptors: Listening Comprehension, Second Language Learning, Second Language Instruction, Listening Comprehension Tests
Nering, Michael L., Ed.; Ostini, Remo, Ed. – Routledge, Taylor & Francis Group, 2010
This comprehensive "Handbook" focuses on the most used polytomous item response theory (IRT) models. These models help us understand the interaction between examinees and test questions where the questions have various response categories. The book reviews all of the major models and includes discussions about how and where the models…
Descriptors: Guides, Item Response Theory, Test Items, Correlation
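In a polytomous model such as Samejima's graded response model (one of the families a handbook like this covers), the probability of responding in category k is the difference between adjacent cumulative boundary curves. A small illustrative sketch with made-up parameter values:

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Samejima graded response model.

    theta : latent trait value
    a     : item discrimination
    b     : increasing list of K-1 category boundary locations
    Returns the probabilities of the K response categories.
    """
    b = np.asarray(b)
    # Cumulative boundary curves P(X >= k), padded with 1 and 0 at the ends.
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    cum = np.concatenate(([1.0], p_star, [0.0]))
    return cum[:-1] - cum[1:]  # P(X = k) = P(X >= k) - P(X >= k+1)

# Example: a 4-category item with boundaries at -1, 0, 1.5 (hypothetical).
print(grm_category_probs(theta=0.5, a=1.2, b=[-1.0, 0.0, 1.5]))
```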
Chevalaz, Gerard M.; Tatsuoka, Kikumi K. – 1983
Two order-theoretic techniques were presented and compared: the ordering theory of Krus and Bart (1974) and Tatsuoka and Tatsuoka's (1981) extension of Takeya's item relational structure analysis (IRS). Both were used to extract the hierarchical item structure from three datasets. Directed graphs were constructed, and both methods were assessed as to how…
Descriptors: Comparative Analysis, Computer Simulation, Instructional Design, Item Analysis
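In ordering-theoretic approaches of this kind, a prerequisite relation between items is inferred from how rarely examinees answer a "harder" item correctly while missing the "easier" one; the resulting edges form the directed graph. A rough sketch of the idea, assuming a binary response matrix and an arbitrary tolerance (a generic illustration, not either paper's exact algorithm):

```python
import numpy as np

def prerequisite_edges(responses, tol=0.03):
    """Infer directed edges i -> j ("i is prerequisite to j").

    responses : (n_examinees, n_items) binary 0/1 matrix
    tol       : maximum tolerated proportion of disconfirming
                patterns, i.e. examinees who fail i but pass j.
    """
    n, k = responses.shape
    edges = []
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            # Disconfirming pattern for i -> j: item i wrong, item j right.
            disconfirming = np.mean((responses[:, i] == 0) & (responses[:, j] == 1))
            if disconfirming <= tol:
                edges.append((i, j))
    return edges
```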
Subkoviak, Michael J.; Harris, Deborah J. – 1984
This study examined three statistical methods for selecting items for mastery tests. One is the pretest-posttest method due to Cox and Vargas (1966); it is computationally simple, but has a number of serious limitations. The second is a latent trait method recommended by van der Linden (1981); it is computationally complex, but has a number of…
Descriptors: Comparative Analysis, Elementary Secondary Education, Item Analysis, Latent Trait Theory
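The Cox and Vargas (1966) index illustrates why the pretest-posttest method is computationally simple: an item's value for a mastery test is summarized as the difference between the proportions passing it after and before instruction. A minimal sketch with hypothetical data:

```python
import numpy as np

def pretest_posttest_index(pre, post):
    """Cox-Vargas difference index D = p_post - p_pre.

    pre, post : binary (n_examinees, n_items) matrices of item scores
                collected before and after instruction.
    Items with large positive D are sensitive to instruction and are
    candidates for a mastery test.
    """
    return post.mean(axis=0) - pre.mean(axis=0)

rng = np.random.default_rng(1)
pre = (rng.random((50, 10)) < 0.3).astype(int)   # hypothetical pretest
post = (rng.random((50, 10)) < 0.8).astype(int)  # hypothetical posttest
print(pretest_posttest_index(pre, post))
```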
Holland, Paul W.; Thayer, Dorothy T. – 1986
The Mantel-Haenszel procedure (MH) is a practical, inexpensive, and powerful way to detect test items that function differently in two groups of examinees. MH is a natural outgrowth of previously suggested chi square methods, and it is also related to methods based on item response theory. The study of items that function differently for two…
Descriptors: Comparative Analysis, Hypothesis Testing, Item Analysis, Latent Trait Theory
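The MH procedure aggregates 2x2 tables (group by correct/incorrect) across matched score strata. A compact sketch of the common odds-ratio estimate and the continuity-corrected chi-square, following the standard Mantel-Haenszel formulas (the table layout and variable names are mine):

```python
import numpy as np

def mantel_haenszel(tables):
    """tables: list of 2x2 count arrays, one per matched score stratum,
    laid out as [[A, B], [C, D]] with rows = reference/focal group
    and columns = correct/incorrect."""
    A = np.array([t[0][0] for t in tables], dtype=float)
    B = np.array([t[0][1] for t in tables], dtype=float)
    C = np.array([t[1][0] for t in tables], dtype=float)
    D = np.array([t[1][1] for t in tables], dtype=float)
    T = A + B + C + D

    # Common odds ratio: values far from 1 suggest differential functioning.
    alpha_mh = np.sum(A * D / T) / np.sum(B * C / T)

    # Continuity-corrected MH chi-square (1 df).
    n_ref, n_foc = A + B, C + D          # group sizes per stratum
    m1, m0 = A + C, B + D                # correct / incorrect totals
    expected = n_ref * m1 / T
    variance = n_ref * n_foc * m1 * m0 / (T**2 * (T - 1))
    chi2 = (abs(np.sum(A - expected)) - 0.5) ** 2 / np.sum(variance)
    return alpha_mh, chi2
```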
Skaggs, Gary; Stevenson, Jose – 1986
This study assesses the accuracy of ASCAL, a microcomputer-based program for estimating item parameters for the three-parameter logistic model in item response theory. Item responses are generated from a three-parameter model, and item parameter estimates from ASCAL are compared to the generating item parameters and to estimates produced by…
Descriptors: Algorithms, Comparative Analysis, Computer Software, Estimation (Mathematics)
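The simulation design rests on the three-parameter logistic model, under which the probability of a correct response is P(θ) = c + (1 − c) / (1 + exp(−1.7a(θ − b))). A minimal sketch of generating item responses from that model (parameter ranges are arbitrary):

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """Three-parameter logistic item response function
    (with the conventional 1.7 scaling constant)."""
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

rng = np.random.default_rng(2)
theta = rng.normal(size=1000)               # examinee abilities
a = rng.uniform(0.5, 2.0, size=30)          # discriminations
b = rng.normal(size=30)                     # difficulties
c = rng.uniform(0.0, 0.25, size=30)         # pseudo-guessing

# Probability matrix (examinees x items), then simulated 0/1 responses.
P = p_3pl(theta[:, None], a, b, c)
responses = (rng.random(P.shape) < P).astype(int)
```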
Peer reviewed
Jacobs, Stanley S. – Research in Higher Education, 1995
Comparison of college freshman performance on two different forms of the California Critical Thinking Skills Test (n = 684 and 692) found a lack of equivalence between forms and low internal consistency reliability. It is suggested that, although the test may be useful for research, it is not appropriate for decision making about individual students.…
Descriptors: College Freshmen, Comparative Analysis, Critical Thinking, Educational Research
Kingsbury, G. Gage – 1985
This study examined a procedure that uses response function discrepancies (RFD) to assess content-area and total-test dimensionality. Three different versions of the RFD procedure were compared to Bejar's principal-axis content-area procedure and Indow and Samejima's exploratory factor analytic technique. The procedures were compared in terms of the…
Descriptors: Achievement Tests, Comparative Analysis, Elementary Education, Estimation (Mathematics)
Muraki, Eiji – 1984
The TESTFACT computer program and full-information factor analysis of test items were used in a computer simulation conducted to correct for the guessing effect. Full-information factor analysis also corrects for omitted items. The present version of TESTFACT handles up to five factors and 150 items. A preliminary smoothing of the tetrachoric…
Descriptors: Comparative Analysis, Computer Simulation, Computer Software, Correlation
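Full-information item factor analysis starts from the matrix of tetrachoric correlations between item pairs. As a quick illustration of what a tetrachoric correlation is, the classical "cosine-pi" approximation for a 2x2 table of two dichotomous items can be sketched as follows (a rough approximation only, not what TESTFACT itself computes):

```python
import math

def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    a, d : counts of concordant responses (both right / both wrong)
    b, c : counts of discordant responses
    """
    if b == 0 or c == 0:
        return 1.0  # degenerate table: perfect positive association
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))

# Example 2x2 table for two items (hypothetical counts).
print(tetrachoric_cos_pi(a=40, b=10, c=15, d=35))
```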
Marco, Gary L.; And Others – 1985
Three item response models were evaluated for estimating item parameters and equating test scores. The models, which approximated the traditional three-parameter model, included: (1) the Rasch one-parameter model, operationalized in the BICAL computer program; (2) an approximate three-parameter logistic model based on coarse group data divided…
Descriptors: College Entrance Examinations, Comparative Analysis, Computer Software, Equated Scores
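Equating with IRT requires placing item parameter estimates from separate calibrations on a common scale; one standard choice is the mean-sigma linking transformation estimated from anchor items. A short sketch (this linking method is a generic illustration, not necessarily the one used in the study):

```python
import numpy as np

def mean_sigma_link(b_from, b_to):
    """Linear linking constants putting scale 'from' onto scale 'to',
    estimated from difficulty values of common (anchor) items.

    Returns (A, B) such that b* = A * b + B and a* = a / A.
    """
    A = np.std(b_to, ddof=1) / np.std(b_from, ddof=1)
    B = np.mean(b_to) - A * np.mean(b_from)
    return A, B

# Hypothetical anchor-item difficulties from two calibration runs.
b_x = np.array([-1.2, -0.4, 0.1, 0.8, 1.5])
b_y = np.array([-1.0, -0.2, 0.3, 1.1, 1.9])
A, B = mean_sigma_link(b_x, b_y)
print(A, B, A * b_x + B)  # b_x rescaled onto the Y metric
```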
Buhr, Dianne C.; Algina, James – 1986
The focus of this study is the estimation procedures implemented in the BILOG computer program. One purpose is to compare the item parameter estimates produced by the various procedures available in BILOG. Four different models are used: the one-, two-, and three-parameter models, and a three-parameter model with a common guessing parameter. The results…
Descriptors: Ability, Bayesian Statistics, Comparative Analysis, Computer Oriented Programs
Hambleton, Ronald K.; And Others – 1987
The study compared two promising item response theory (IRT) item-selection methods, optimal and content-optimal, with two non-IRT item selection methods, random and classical, for use in fixed-length certification exams. The four methods were used to construct 20-item exams from a pool of approximately 250 items taken from a 1985 certification…
Descriptors: Comparative Analysis, Content Validity, Cutting Scores, Difficulty Level
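"Optimal" IRT item selection for a fixed-length certification exam typically means choosing the items with maximum information at the cut score. A sketch using the standard 3PL item information function (the cut score and parameter ranges are hypothetical):

```python
import numpy as np

def info_3pl(theta, a, b, c):
    """3PL item information function at ability theta."""
    p = c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))
    return (1.7 * a) ** 2 * ((1 - p) / p) * ((p - c) / (1 - c)) ** 2

rng = np.random.default_rng(3)
a = rng.uniform(0.5, 2.0, 250)   # pool of ~250 items, as in the study
b = rng.normal(size=250)
c = rng.uniform(0.0, 0.25, 250)

cut = 0.0                                # hypothetical cut score
info = info_3pl(cut, a, b, c)
chosen = np.argsort(info)[::-1][:20]     # the 20 most informative items
print(chosen)
```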
Phillips, Gary W. – 1982
This paper presents an introduction to the use of latent trait models for the estimation of domain scores. It was shown that these models provided an advantage over classical test theory and binomial error models in that unbiased estimates of true domain scores could be obtained even when items were not randomly selected from a universe of items.…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Estimation (Mathematics), Goodness of Fit
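Under a latent trait model, the true domain score is the expected proportion correct over the item domain, i.e. the average of the item response functions at the examinee's ability. A minimal sketch (the 3PL response function is redefined here so the block stands alone; parameters are hypothetical):

```python
import numpy as np

def p_3pl(theta, a, b, c):
    return c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))

def domain_score(theta, a, b, c):
    """Estimated true domain score: mean item response
    probability across the items representing the domain."""
    return np.mean(p_3pl(theta, a, b, c))

rng = np.random.default_rng(4)
a, b, c = rng.uniform(0.5, 2, 40), rng.normal(size=40), rng.uniform(0, 0.25, 40)
print(domain_score(theta=0.3, a=a, b=b, c=c))  # expected proportion correct
```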
Sarvela, Paul D. – 1986
Four discrimination indices were compared, using score distributions which were normal, bimodal, and negatively skewed. The score distributions were systematically varied to represent the common circumstances of a military training situation using criterion-referenced mastery tests. Three 20-item tests were administered to 110 simulated subjects.…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Item Analysis, Mastery Tests
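For criterion-referenced mastery tests, item discrimination is usually defined against the mastery cut rather than the total-score continuum, e.g. Brennan's B-index: the difference in item difficulty between examinees who passed and failed the test. A minimal sketch of that one index (the abstract does not specify which four indices the study compared):

```python
import numpy as np

def b_index(responses, total_cut):
    """Brennan's B-index for each item: p(masters) - p(nonmasters),
    where mastery is defined by a cut on the total test score.

    responses : (n_examinees, n_items) binary matrix
    total_cut : minimum total score counted as mastery
    """
    total = responses.sum(axis=1)
    masters = responses[total >= total_cut]
    nonmasters = responses[total < total_cut]
    return masters.mean(axis=0) - nonmasters.mean(axis=0)

rng = np.random.default_rng(5)
resp = (rng.random((110, 20)) < 0.7).astype(int)  # 110 simulated examinees
print(b_index(resp, total_cut=14))                # 70% mastery cut
```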