Showing 151 to 165 of 639 results
Peer reviewed
Sahin, Alper; Anil, Duygu – Educational Sciences: Theory and Practice, 2017
This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprising 50 items was administered to 6,288 students. Data from this test were used to obtain data sets of…
Descriptors: Test Length, Sample Size, Item Response Theory, Test Construction
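The abstract above does not name the three dichotomous models, but the standard unidimensional family is the 1PL (Rasch), 2PL, and 3PL, all built on the same logistic item response function. A minimal sketch of the 3PL form, which reduces to the 2PL when c = 0 and to the 1PL when additionally a = 1 (parameter values here are illustrative only):

```python
import math

def irt_prob(theta, a=1.0, b=0.0, c=0.0):
    """3PL item response function: probability of a correct response
    given ability theta, discrimination a, difficulty b, and
    pseudo-guessing c. c=0 gives the 2PL; c=0, a=1 gives the 1PL."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# An examinee whose ability equals the item difficulty answers a
# 2PL item correctly with probability 0.5.
print(irt_prob(theta=0.0, b=0.0))  # 0.5
```

Sample-size and test-length effects enter when these parameters are estimated from response data rather than fixed, which is the question the study investigates.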
Peer reviewed
Byram, Jessica N.; Seifert, Mark F.; Brooks, William S.; Fraser-Cotlin, Laura; Thorp, Laura E.; Williams, James M.; Wilson, Adam B. – Anatomical Sciences Education, 2017
With integrated curricula and multidisciplinary assessments becoming more prevalent in medical education, there is a continued need for educational research to explore the advantages, consequences, and challenges of integration practices. This retrospective analysis investigated the number of items needed to reliably assess anatomical knowledge in…
Descriptors: Anatomy, Science Tests, Test Items, Test Reliability
Wang, Keyin – ProQuest LLC, 2017
The comparison of item-level computerized adaptive testing (CAT) and multistage adaptive testing (MST) has been researched extensively (e.g., Kim & Plake, 1993; Luecht et al., 1996; Patsula, 1999; Jodoin, 2003; Hambleton & Xing, 2006; Keng, 2008; Zheng, 2012). Various CAT and MST designs have been investigated and compared under the same…
Descriptors: Comparative Analysis, Computer Assisted Testing, Adaptive Testing, Test Items
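Item-level CAT, one side of the comparison above, typically selects the next item that maximizes Fisher information at the current ability estimate, whereas MST routes examinees through preassembled stages. A minimal sketch of the CAT selection step under a 2PL model, with a hypothetical three-item bank:

```python
import math

def p2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item bank: (discrimination a, difficulty b) pairs.
bank = [(0.8, -1.0), (1.2, 0.0), (1.5, 1.0)]

def next_item(theta, bank):
    """Index of the bank item with maximum information at theta."""
    return max(range(len(bank)),
               key=lambda i: fisher_info(theta, *bank[i]))

print(next_item(0.0, bank))  # 1: the a=1.2, b=0.0 item is most informative at theta=0
```

In an MST design this per-item decision is replaced by routing between fixed modules, which is one source of the design trade-offs the studies cited here investigate.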
Northwest Evaluation Association, 2017
Some schools use results from the MAP® Growth™ interim assessments from Northwest Evaluation Association® (NWEA®) in a number of high-stakes ways. These include as a component of their teacher evaluation systems, to determine whether a student advances to the next grade, or as an indicator for student readiness for certain programs or…
Descriptors: Outcome Measures, Academic Achievement, High Stakes Tests, Test Results
Peer reviewed
Lee, HyeSun – Applied Measurement in Education, 2018
The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…
Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores
Peer reviewed
Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Peer reviewed
Lee, HyeSun; Geisinger, Kurt F. – Educational and Psychological Measurement, 2016
The current study investigated the impact of matching criterion purification on the accuracy of differential item functioning (DIF) detection in large-scale assessments. The three matching approaches for DIF analyses (block-level matching, pooled booklet matching, and equated pooled booklet matching) were employed with the Mantel-Haenszel…
Descriptors: Test Bias, Measurement, Accuracy, Statistical Analysis
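The Mantel-Haenszel procedure used in the study above stratifies examinees on a matching criterion (typically total score), tabulates correct/incorrect responses for the reference and focal groups within each stratum, and pools the odds ratios; purification refines the matching criterion by removing flagged items. A minimal sketch of the pooled odds ratio, with purely hypothetical counts:

```python
# Each stratum: (ref_correct, ref_wrong, focal_correct, focal_wrong).
# Counts below are hypothetical, for illustration only.
strata = [
    (40, 10, 35, 15),
    (30, 20, 25, 25),
    (20, 30, 15, 35),
]

def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across score strata.
    A value near 1 suggests no uniform DIF on the studied item."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n   # ref correct x focal wrong
        den += b * c / n   # ref wrong x focal correct
    return num / den

print(round(mh_odds_ratio(strata), 3))
```

The block-level, pooled-booklet, and equated-pooled-booklet approaches compared in the study differ in how the matching-score strata are formed when examinees see different test booklets.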
Peer reviewed
Cao, Mengyang; Tay, Louis; Liu, Yaowu – Educational and Psychological Measurement, 2017
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach utilizes the Wald-2 approach to identify anchor items and then iteratively tests for DIF items with the Wald-1 approach. Monte Carlo…
Descriptors: Monte Carlo Methods, Test Items, Test Bias, Error of Measurement
Steinkamp, Susan Christa – ProQuest LLC, 2017
For test scores that rely on accurate estimation of ability via an IRT model, use and interpretation depend on the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, have prior knowledge of test content, or do not approach a test with the intent of answering…
Descriptors: Test Items, Item Response Theory, Scores, Test Wiseness
Peer reviewed
Galikyan, Irena; Madyarov, Irshat; Gasparyan, Rubina – ETS Research Report Series, 2019
The broad range of English language teaching and learning contexts present in the world today necessitates high quality assessment instruments that can provide reliable and meaningful information about learners' English proficiency levels to relevant stakeholders. The "TOEFL Junior"® tests were recently introduced by Educational Testing…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Student Attitudes
Peer reviewed
Senel, Selma; Kutlu, Ömer – European Journal of Special Needs Education, 2018
This paper examines listening comprehension skills of visually impaired students (VIS) using computerised adaptive testing (CAT) and reader-assisted paper-pencil testing (raPPT), along with student views about them. An explanatory mixed-methods design was used in this study. The sample comprised 51 VIS in the 7th and 8th grades. Nine of these students were…
Descriptors: Computer Assisted Testing, Adaptive Testing, Visual Impairments, Student Attitudes
Peer reviewed
McGonagle, Katherine A.; Brown, Charles; Schoeni, Robert F. – Field Methods, 2015
Recording interviews is a key feature of quality control protocols for most survey organizations. We examine the effects on interview length and data quality of a new protocol adopted by a national panel study. The protocol recorded a randomly chosen one-third of all interviews digitally, although all respondents were asked for permission to…
Descriptors: National Surveys, Interviews, Quality Control, Test Length
Peer reviewed
Zaporozhets, Olga; Fox, Christine M.; Beltyukova, Svetlana A.; Laux, John M.; Piazza, Nick J.; Salyers, Kathleen – Measurement and Evaluation in Counseling and Development, 2015
The aim of this study was to develop a linear measure of change using University of Rhode Island Change Assessment items that represented Prochaska and DiClemente's theory. The resulting Toledo Measure of Change is short, is easy to use, and provides reliable scores for identifying individuals' stage of change and progression within that stage.
Descriptors: Item Response Theory, Change, Measures (Individuals), Test Construction
Peer reviewed
Hsu, Chia-Ling; Wang, Wen-Chung – Journal of Educational Measurement, 2015
Cognitive diagnosis models provide profile information about a set of latent binary attributes, whereas item response models yield a summary report on a latent continuous trait. To utilize the advantages of both models, higher order cognitive diagnosis models were developed in which information about both latent binary attributes and latent…
Descriptors: Computer Assisted Testing, Adaptive Testing, Models, Cognitive Measurement
Peer reviewed
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
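The reliability definition used in the study above, the ratio of true-score variance to observed-score variance, can be illustrated with a toy computation (the variance values are hypothetical):

```python
def reliability(true_var, error_var):
    """Classical reliability: true-score variance divided by
    observed-score variance (true + error variance, assuming
    errors are independent of true scores)."""
    return true_var / (true_var + error_var)

# Hypothetical values: more error variance means lower reliability.
print(reliability(true_var=80.0, error_var=20.0))  # 0.8
```

DIF inflates or distorts the observed-score variance for affected groups, which is how it can degrade this ratio and the standard error of the IRT ability estimate examined in the simulations.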