Showing all 10 results
Peer reviewed
Direct link
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018
The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…
Descriptors: Test Content, Difficulty Level, Test Items, Test Construction
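To make the miditest idea concrete, here is a minimal Python sketch under invented assumptions (hypothetical IRT b-parameters and a greedy selection rule not taken from Sinharay and Holland): anchor items are drawn from near the pool's mean difficulty, so the anchor roughly matches the total test's mean difficulty while keeping the spread of difficulties small.

```python
# Hypothetical sketch: choosing a "miditest"-style anchor from an item pool.
# Difficulties are illustrative b-parameters, not data from the studies above.
import statistics

def pick_miditest(difficulties, anchor_size):
    """Greedily pick anchor items closest to the pool's mean difficulty,
    approximating the pool mean while keeping the difficulty spread small."""
    mean_b = statistics.mean(difficulties)
    ranked = sorted(range(len(difficulties)),
                    key=lambda i: abs(difficulties[i] - mean_b))
    return sorted(ranked[:anchor_size])

pool = [-1.8, -1.1, -0.6, -0.2, 0.0, 0.3, 0.7, 1.2, 1.9]  # made-up b-parameters
anchor = pick_miditest(pool, 3)
print(anchor, [pool[i] for i in anchor])  # middle-difficulty items only
```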
Peer reviewed
Download full text (PDF on ERIC)
Nguyen, Huu Thanh Minh – International Journal of Language Testing, 2022
Combining Bachman's (1990) conceptualization of content validity with Messick's (1989) unifying model of construct validity, this study attempted to fill the gap in research on content validation in the context of a university reading achievement test by (a) examining traditional content validity evidence and (b) analyzing the test scores to…
Descriptors: Reading Achievement, Reading Tests, Achievement Tests, Majors (Students)
Choi, Kilchan; Kao, Jenny C.; Rivera, Nichole M.; Cai, Li – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2018
This report is the third in a series considering career-readiness features within high school assessments. The goal of this study was to explore international comparisons by applying feature analysis to Korean assessment items. Twenty math test items from Gyeonggi Province in South Korea along with performance data from roughly 4,000 Grade 12…
Descriptors: Career Readiness, High School Students, Cross Cultural Studies, Test Items
Shin, Chingwei David; Chien, Yuehmei; Way, Walter Denny – Pearson, 2012
Content balancing is one of the most important components of computerized adaptive testing (CAT), especially in K-12 large-scale tests, where a complex constraint structure is required to cover a broad spectrum of content. The purpose of this study is to compare the weighted penalty model (WPM) and the weighted deviation method (WDM) under…
Descriptors: Computer Assisted Testing, Elementary Secondary Education, Test Content, Models
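A simplified sketch of how a weighted-deviation-style selection rule trades item information against deviations from content targets; the weights, targets, and candidate items below are invented, and this is not Pearson's implementation of the WDM or WPM.

```python
# Simplified sketch of weighted-deviation-style content balancing in CAT.
# Weights, targets, and candidates are illustrative assumptions only.

def wdm_score(item_info, item_area, counts, targets, weights):
    """Item information minus weighted deviations from content targets,
    projected as if this item were administered next."""
    penalty = 0.0
    for area, target in targets.items():
        projected = counts.get(area, 0) + (1 if area == item_area else 0)
        penalty += weights[area] * abs(target - projected)
    return item_info - penalty

targets = {"algebra": 10, "geometry": 8}      # desired item counts per area
weights = {"algebra": 0.3, "geometry": 0.9}   # geometry constraint weighted heavily
counts = {"algebra": 9, "geometry": 3}        # items administered so far

# Candidates: (Fisher information at the current theta, content area)
candidates = [(0.9, "algebra"), (0.7, "geometry")]
best = max(candidates, key=lambda c: wdm_score(c[0], c[1], counts, targets, weights))
print(best)  # geometry item wins: its content need outweighs the information gap
```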
Peer reviewed
Download full text (PDF on ERIC)
Deane, Paul; Gurevich, Olga – ETS Research Report Series, 2008
For many purposes, it is useful to collect a corpus of texts all produced in response to the same stimulus, whether to measure performance (as on a test) or to test hypotheses about population differences. This paper examines several methods for measuring similarities in phrasing and content and demonstrates that these methods can be used to identify…
Descriptors: Test Content, Computational Linguistics, Native Speakers, Writing Tests
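One elementary way to measure similarity in phrasing, offered as an illustrative sketch rather than the paper's actual method: cosine similarity between bag-of-bigram vectors of two responses to the same prompt.

```python
# Sketch: cosine similarity over word bigrams as a crude phrasing-overlap measure.
# Illustrates the general idea only; Deane and Gurevich's methods may differ.
import math
from collections import Counter

def bigrams(text):
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

def cosine(c1, c2):
    shared = set(c1) & set(c2)
    dot = sum(c1[g] * c2[g] for g in shared)
    norm = (math.sqrt(sum(v * v for v in c1.values()))
            * math.sqrt(sum(v * v for v in c2.values())))
    return dot / norm if norm else 0.0

a = "the graph shows a steady increase in sales"
b = "the graph shows a sharp increase in profit"
print(round(cosine(bigrams(a), bigrams(b)), 3))  # ~0.571 shared phrasing
```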
Dorans, Neil J. – College Entrance Examination Board, 2000
Distinctions were made between three classes of statistical linkage: equivalence, concordance, and prediction. These distinctions were based on rational content considerations and empirical statistical relationships. A large database involving SAT I and ACT scores was used to determine which type of linkage was best suited for different scores and…
Descriptors: Statistical Analysis, Prediction, Scores, Standardized Tests
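To make the concordance-versus-prediction distinction concrete, a small sketch on synthetic scores (not the SAT I/ACT database): a concordance matches the two score distributions percentile by percentile, while a prediction regresses one score on the other and so pulls extreme scores toward the mean.

```python
# Illustrative contrast between concordance and prediction linkages.
# Scores are synthetic; this is not the College Board analysis itself.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(500, 100, 5000)            # "test X" scores
y = 0.8 * x + rng.normal(100, 60, 5000)   # correlated "test Y" scores

def concord(score):
    """Concordance: map X to Y by matching percentile ranks."""
    pct = (x < score).mean() * 100
    return np.percentile(y, pct)

# Prediction: least-squares regression of Y on X.
slope, intercept = np.polyfit(x, y, 1)

print(round(concord(600.0), 1))             # distribution-matching answer
print(round(slope * 600.0 + intercept, 1))  # regression answer, nearer the mean
```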
Peer reviewed
Direct link
van der Linden, Wim J.; Ariel, Adelaide; Veldkamp, Bernard P. – Journal of Educational and Behavioral Statistics, 2006
Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information that violate the content…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Item Banks
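A hedged sketch of the tension the abstract describes, using invented 2PL items: the maximum-information item overall versus the best item available once a content constraint must still be satisfied.

```python
# Sketch: 2PL Fisher information and constrained item selection.
# Items and the content constraint are invented for illustration.
import math

def info_2pl(a, b, theta):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# (discrimination a, difficulty b, content area)
pool = [(1.6, 0.0, "algebra"), (1.4, 0.1, "algebra"), (0.8, 0.2, "geometry")]
theta = 0.0

best_any = max(pool, key=lambda it: info_2pl(it[0], it[1], theta))
best_geo = max((it for it in pool if it[2] == "geometry"),
               key=lambda it: info_2pl(it[0], it[1], theta))
print(best_any)  # highest-information item overall
print(best_geo)  # if geometry is still owed, a less informative item must be taken
```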
Werner, Eric – 1991
Credentialing agencies that use multi-test examinations (MTEs) should be concerned with the quality of pass/fail outcomes as well as with the proportion of candidates passing. This study addressed three questions: (1) the effects on passing rate of MTE reliability, number of tests comprising an MTE, and correlations between pairs of component…
Descriptors: Accrediting Agencies, Agency Role, Certification, Comparative Analysis
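A small Monte Carlo sketch of the kind of question posed, with all parameters invented: under a conjunctive pass/fail rule (pass every component), the correlation between component tests changes the overall passing rate.

```python
# Monte Carlo sketch: pass rate of a two-test MTE under a conjunctive rule.
# Cut score, correlations, and the score model are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)

def pass_rate(rho, n=200_000, cut=-0.5):
    """Share of candidates scoring above the cut on both component tests."""
    cov = [[1.0, rho], [rho, 1.0]]
    scores = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    return (scores > cut).all(axis=1).mean()

for rho in (0.0, 0.5, 0.9):
    print(rho, round(pass_rate(rho), 3))
# Higher correlation between component tests raises the joint passing rate.
```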
Baghi, Heibatollah; And Others – 1995
Issues related to linking tests with constructed-response items were explored, specifically by comparing single-group and anchor-test designs for linking raw scores from alternate forms of performance-based student assessments in the context of Delaware's performance-based assessment program. This study explored use of the two test…
Descriptors: Comparative Analysis, Constructed Response, Correlation, Educational Assessment
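For orientation, a minimal single-group linear-equating sketch on synthetic raw scores (not the Delaware data): with the same examinees taking both forms, Form X scores are mapped onto the Form Y scale by matching means and standard deviations.

```python
# Sketch: single-group linear equating of raw scores from two forms.
# Scores are synthetic; the study's designs and data differ.
import numpy as np

rng = np.random.default_rng(2)
ability = rng.normal(0, 1, 1000)                    # same examinees, both forms
form_x = 30 + 6 * ability + rng.normal(0, 2, 1000)  # the harder form, say
form_y = 34 + 5 * ability + rng.normal(0, 2, 1000)

def linear_equate(x, x_scores, y_scores):
    """Map a Form X raw score onto the Form Y scale (mean/SD matching)."""
    return y_scores.mean() + y_scores.std() * (x - x_scores.mean()) / x_scores.std()

print(round(linear_equate(36.0, form_x, form_y), 1))  # Form Y equivalent of X = 36
```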
Peer reviewed
Download full text (PDF on ERIC)
Horkay, Nancy; Bennett, Randy Elliott; Allen, Nancy; Kaplan, Bruce; Yan, Fred – Journal of Technology, Learning, and Assessment, 2006
This study investigated the comparability of scores for paper and computer versions of a writing test administered to eighth grade students. Two essay prompts were given on paper to a nationally representative sample as part of the 2002 main NAEP writing assessment. The same two essay prompts were subsequently administered on computer to a second…
Descriptors: Writing Evaluation, Writing Tests, Computer Assisted Testing, Program Effectiveness
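As a rough illustration of a basic mode-comparability check on synthetic data (NAEP's actual analyses were far more involved): a standardized mean difference between paper and computer essay scores.

```python
# Sketch: standardized mean difference between paper and computer scores.
# Data are synthetic; scale, group sizes, and effect are invented.
import numpy as np

rng = np.random.default_rng(3)
paper = rng.normal(150, 35, 1500)     # essay scale scores on paper
computer = rng.normal(147, 35, 1500)  # same prompts delivered on computer

pooled_sd = np.sqrt((paper.var(ddof=1) + computer.var(ddof=1)) / 2)
d = (computer.mean() - paper.mean()) / pooled_sd
print(round(d, 3))  # negative d would mean lower scores on computer
```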