Showing 1 to 15 of 22 results
Peer reviewed
Vida, Leonardo J.; Bolsinova, Maria; Brinkhuis, Matthieu J. S. – International Educational Data Mining Society, 2021
The quality of exams drives the test-taking behavior of examinees and is a proxy for the quality of teaching. As most university exams have strict time limits, and speededness is an important measure of examinees' cognitive state, speededness might be used to assess the connection between exam quality and examinee performance. The practice of…
Descriptors: Accuracy, Test Items, Tests, Student Behavior
Peer reviewed
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Lina Anaya; Nagore Iriberri; Pedro Rey-Biel; Gema Zamarro – Annenberg Institute for School Reform at Brown University, 2021
Standardized assessments are widely used to determine access to educational resources with important consequences for later economic outcomes in life. However, many design features of the tests themselves may lead to psychological reactions influencing performance. In particular, the level of difficulty of the earlier questions in a test may…
Descriptors: Test Construction, Test Wiseness, Test Items, Difficulty Level
Peer reviewed
Gu, Lin; Ling, Guangming; Liu, Ou Lydia; Yang, Zhitong; Li, Guirong; Kardanova, Elena; Loyalka, Prashant – Assessment & Evaluation in Higher Education, 2021
We examine the effects of computer-based versus paper-based assessment of critical thinking skills, adapted from English (in the U.S.) to Chinese. Using data collected based on a random assignment between the two modes in multiple Chinese colleges, we investigate mode effects from multiple perspectives: mean scores, measurement precision, item…
Descriptors: Critical Thinking, Tests, Test Format, Computer Assisted Testing
Haladyna, Thomas M. – IDEA Center, Inc., 2018
Writing multiple-choice test items to measure student learning in higher education is a challenge. Based on extensive scholarly research and experience, the author describes various item formats, offers guidelines for creating these items, and provides many examples of both good and bad test items. He also suggests some shortcuts for developing…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Higher Education
Peer reviewed
Han, Kyung T.; Wells, Craig S.; Hambleton, Ronald K. – Practical Assessment, Research & Evaluation, 2015
In item response theory test scaling/equating with the three-parameter model, the scaling coefficients A and B have no impact on the c-parameter estimates of the test items, since the c-parameter estimates are not adjusted in the scaling/equating procedure. The main research question in this study concerned how serious the consequences would be if…
Descriptors: Item Response Theory, Monte Carlo Methods, Scaling, Test Items
Peer reviewed
Jordan, Sally – Computers & Education, 2012
Students were observed directly, in a usability laboratory, and indirectly, by means of an extensive evaluation of responses, as they attempted interactive computer-marked assessment questions that required free-text responses of up to 20 words and as they amended their responses after receiving feedback. This provided more general insight into…
Descriptors: Learner Engagement, Feedback (Response), Evaluation, Test Interpretation
Peer reviewed
Wright, Christian D.; Eddy, Sarah L.; Wenderoth, Mary Pat; Abshire, Elizabeth; Blankenbiller, Margaret; Brownell, Sara E. – CBE - Life Sciences Education, 2016
Recent reform efforts in undergraduate biology have recommended transforming course exams to test at more cognitively challenging levels, which may mean including more cognitively challenging questions and more constructed-response questions on assessments. However, changing the characteristics of exams could result in bias against historically underserved…
Descriptors: Introductory Courses, Biology, Undergraduate Students, Higher Education
Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013
The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…
Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores
Peer reviewed
Kirschner, Sophie; Borowski, Andreas; Fischer, Hans E.; Gess-Newsome, Julie; von Aufschnaiter, Claudia – International Journal of Science Education, 2016
Teachers' professional knowledge is assumed to be a key variable for effective teaching. As teacher education aims to enhance the professional knowledge of current and future teachers, this knowledge should be described and assessed. Nevertheless, only a limited number of studies quantitatively measure physics teachers' professional…
Descriptors: Evaluation Methods, Tests, Test Format, Science Instruction
Peer reviewed
Breakstone, Joel – Theory and Research in Social Education, 2014
This article considers the design process for new formative history assessments. Over the course of 3 years, my colleagues from the Stanford History Education Group and I designed, piloted, and revised dozens of "History Assessments of Thinking" (HATs). As we created HATs, we sought to gather information about their cognitive validity,…
Descriptors: History Instruction, Formative Evaluation, Tests, Correlation
Brese, Falk, Ed. – International Association for the Evaluation of Educational Achievement, 2012
The goal for selecting the released set of test items was to have approximately 25% of each of the full item sets for mathematics content knowledge (MCK) and mathematics pedagogical content knowledge (MPCK) that would represent the full range of difficulty, content, and item format used in the TEDS-M study. The initial step in the selection was to…
Descriptors: Preservice Teacher Education, Elementary School Teachers, Secondary School Teachers, Mathematics Teachers
Peer reviewed
Judd, Wallace – Practical Assessment, Research & Evaluation, 2009
Over the past twenty years in performance testing, a specific item type with distinguishing characteristics has arisen time and time again. It has been invented independently by dozens of test development teams, and yet this item type is not recognized in the research literature. This article is an invitation to investigate the item type, evaluate…
Descriptors: Test Items, Test Format, Evaluation, Item Analysis
College Board, 2011
This catalog lists research reports, research notes, and other publications available from the College Board's website. The catalog briefly describes research publications available free of charge. Introduced in 1981, the Research Report series includes studies and reviews in areas such as college admission, special populations, subgroup…
Descriptors: Research Reports, Publications, Educational Research, College Students
Peer reviewed
Balch, William R. – Teaching of Psychology, 1989
Studies the effect of item order on test scores and completion time. Students scored slightly higher when test items were grouped sequentially (relating to text and lectures) than when test items were grouped by text chapter but ordered randomly, or when test items were ordered randomly. Found no differences in completion time. (Author/LS)
Descriptors: Educational Research, Higher Education, Performance, Psychology