Showing 1 to 15 of 169 results
Peer reviewed
Suthathip Thirakunkovit – Language Testing in Asia, 2025
Establishing a cut score is a crucial aspect of the test development process since the selected cut score has the potential to impact students' performance outcomes and shape instructional strategies within the classroom. Therefore, it is vital for those involved in test development to set a cut score that is both fair and justifiable. This cut…
Descriptors: Cutting Scores, Culture Fair Tests, Language Tests, Test Construction
Yixi Wang – ProQuest LLC, 2020
Binary item response theory (IRT) models are widely used in educational testing data. These models are not perfect because they simplify the individual item responding process, ignore the differences among different response patterns, cannot handle multidimensionality that lies behind options within a single item, and cannot manage missing response…
Descriptors: Item Response Theory, Educational Testing, Data, Models
Peer reviewed
Ling, Guangming – International Journal of Testing, 2016
To investigate possible iPad related mode effect, we tested 403 8th graders in Indiana, Maryland, and New Jersey under three mode conditions through random assignment: a desktop computer, an iPad alone, and an iPad with an external keyboard. All students had used an iPad or computer for six months or longer. The 2-hour test included reading, math,…
Descriptors: Educational Testing, Computer Assisted Testing, Handheld Devices, Computers
Peer reviewed
Baker, Eva L. – Educational Researcher, 2016
This article investigates the persistent and change elements of educational testing and assessment from 1920 to the present day. I show by examining the addresses and texts of American Educational Research Association presidents a continuing focus on schools, from early experiments and development up through applications in accountability systems.…
Descriptors: Research, Educational Testing, Presidents, Professional Associations
Berman, Amy I.; Haertel, Edward H.; Pellegrino, James W. – National Academy of Education, 2020
This National Academy of Education (NAEd) volume provides guidance to key stakeholders on how to accurately report and interpret comparability assertions concerning large-scale educational assessments as well as how to ensure greater comparability by paying close attention to key aspects of assessment design, content, and procedures. The goal of…
Descriptors: Educational Assessment, Educational Testing, Scores, Comparative Analysis
Peer reviewed
Veldkamp, Bernard P. – Journal of Educational Measurement, 2016
Many standardized tests are now administered via computer rather than paper-and-pencil format. The computer-based delivery mode brings with it certain advantages. One advantage is the ability to adapt the difficulty level of the test to the ability level of the test taker in what has been termed computerized adaptive testing (CAT). A second…
Descriptors: Computer Assisted Testing, Reaction Time, Standardized Tests, Difficulty Level
Peer reviewed
Zaromb, Franklin; Adler, Rachel M.; Bruce, Kelly; Attali, Yigal; Rock, JoAnn – ETS Research Report Series, 2014
This study investigates the benefits of no-stakes educational testing during students' summer vacation as a strategy to mitigate summer learning loss. Fifty-one students in Grades 3-8 from the Every Child Valued (ECV) and Lawrence Community Center (LCC) summer programs in Lawrenceville, NJ, took short, online assessments throughout the summer,…
Descriptors: Educational Testing, Summer Programs, Grade 3, Grade 4
Peer reviewed
Embretson, Susan E.; Yang, Xiangdong – Psychometrika, 2013
This paper presents a noncompensatory latent trait model, the multicomponent latent trait model for diagnosis (MLTM-D), for cognitive diagnosis. In MLTM-D, a hierarchical relationship between components and attributes is specified to be applicable to permit diagnosis at two levels. MLTM-D is a generalization of the multicomponent latent trait…
Descriptors: Mathematics Achievement, Achievement Tests, Item Response Theory, Measurement
Hixson, Nate; Rhudy, Vaughn – West Virginia Department of Education, 2013
Student responses to the West Virginia Educational Standards Test (WESTEST) 2 Online Writing Assessment are scored by a computer-scoring engine. The scoring method is not widely understood among educators, and there exists a misperception that it is not comparable to hand scoring. To address these issues, the West Virginia Department of Education…
Descriptors: Scoring Formulas, Scoring Rubrics, Interrater Reliability, Test Scoring Machines
Peer reviewed
Condon, William – Assessing Writing, 2013
Automated Essay Scoring (AES) has garnered a great deal of attention from the rhetoric and composition/writing studies community since the Educational Testing Service began using e-rater[R] and the "Criterion"[R] Online Writing Evaluation Service as products in scoring writing tests, and most of the responses have been negative. While the…
Descriptors: Measurement, Psychometrics, Evaluation Methods, Educational Testing
Peer reviewed
Bielinska-Kwapisz, Agnieszka; Brown, F. William; Semenik, Richard – Journal of Education for Business, 2012
The Major Field Test in Business (MFT-B), a standardized assessment test of business knowledge among undergraduate business seniors, is widely used to measure student achievement. The Educational Testing Service, publisher of the assessment, provides data that allow institutions to compare their own MFT-B performance to national norms, but that…
Descriptors: Educational Testing, Academic Achievement, Field Tests, National Norms
Young, John W.; Holtzman, Steven; Steinberg, Jonathan – Educational Testing Service, 2011
In this research investigation of score comparability for language minority students (English language learners [ELLs] and former English language learners), we examined 3 indicators of score comparability (reliability, internal test structure, and differential item functioning) for 4th and 8th grade students who took the NCLB-mandated content…
Descriptors: Language Minorities, Second Language Learning, Grade 8, Minority Group Students
Tian, Feng – ProQuest LLC, 2011
There has been a steady increase in the use of mixed-format tests, that is, tests consisting of both multiple-choice items and constructed-response items in both classroom and large-scale assessments. This calls for appropriate equating methods for such tests. As Item Response Theory (IRT) has rapidly become mainstream as the theoretical basis for…
Descriptors: Item Response Theory, Comparative Analysis, Equated Scores, Statistical Analysis
Snyder, James – ProQuest LLC, 2010
This dissertation research examined the changes in item RIT calibration that occurred when adding audio to a set of currently calibrated RIT items and then placing these new items as field test items in the modified assessments on the NWEA MAP test platform. The researcher used test results from over 600 students in the Poway School District in…
Descriptors: Test Results, Test Items, Field Tests, Data Analysis
Peer reviewed
Gur, Bekir S.; Celik, Zafer; Ozoglu, Murat – Journal of Education Policy, 2012
In this article we provide a critique of the interpretation and utilization of Programme for International Student Assessment (PISA) results by the National Education Authorities in Turkey. First, we define and explain what OECD's PISA is. Second, we make an overview of the media coverage in Turkey of the PISA 2003 and 2006 results. Third, we…
Descriptors: Foreign Countries, Curriculum Development, Educational Quality, News Reporting