Showing 1 to 15 of 20 results
Peer reviewed
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Peer reviewed
Liu, Ou Lydia; Rios, Joseph A.; Heilman, Michael; Gerard, Libby; Linn, Marcia C. – Journal of Research in Science Teaching, 2016
Constructed response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine…
Descriptors: Science Tests, Scoring, Automation, Validity
Peer reviewed
He, Chunxiu – International Journal of Higher Education, 2019
Given the practical significance of vocabulary testing in language teaching and the theoretical foundations of developing a vocabulary test, four well-established vocabulary tests are introduced for diagnostic purposes together with their corresponding validation studies, with a focus on the designed purpose, the selection of the items, the…
Descriptors: Vocabulary Development, Language Tests, Validity, Test Format
Peer reviewed
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions made on the basis of the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Peer reviewed
Ihme, Jan Marten; Senkbeil, Martin; Goldhammer, Frank; Gerick, Julia – European Educational Research Journal, 2017
Combinations of different item formats are found quite often in large-scale assessments, and analyses of dimensionality often indicate that such tests are multi-dimensional with respect to task format. In ICILS 2013, three different item types (information-based response tasks, simulation tasks, and authoring tasks) were used to measure computer and…
Descriptors: Foreign Countries, Computer Literacy, Information Literacy, International Assessment
Peer reviewed
Funk, Steven C.; Dickson, K. Laurie – Teaching of Psychology, 2011
The authors experimentally investigated the effects of multiple-choice and short-answer format exam items on exam performance in a college classroom. They randomly assigned 50 students to take a 10-item short-answer pretest or posttest on two 50-item multiple-choice exams in an introduction to personality course. Students performed significantly…
Descriptors: Test Items, Test Format, Multiple Choice Tests, Validity
Peer reviewed
Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal – Applied Measurement in Education, 2012
This study examined the validity of test accommodations for third- through eighth-graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Descriptors: Testing Accommodations, Test Bias, Item Response Theory, Validity
Peer reviewed
Scarpati, Stanley E.; Wells, Craig S.; Lewis, Christine; Jirka, Stephen – Journal of Special Education, 2011
The purpose of this study was to use differential item functioning (DIF) and latent mixture model analyses to explore factors that explain performance differences on a large-scale mathematics assessment between examinees who were allowed to use a calculator or were afforded item presentation accommodations and those who did not receive the same…
Descriptors: Testing Accommodations, Test Items, Test Format, Validity
Peer reviewed
Lakin, Joni M.; Gambrell, James L. – Intelligence, 2012
Measures of broad fluid abilities including verbal, quantitative, and figural reasoning are commonly used in the K-12 school context for a variety of purposes. However, differentiation of these domains is difficult for young children (grades K-2) who lack basic linguistic and mathematical literacy. This study examined the latent factor structure…
Descriptors: Evidence, Validity, Item Response Theory, Numeracy
Peer reviewed
Schott, G. R.; Bellin, W. – Evaluation & Research in Education, 2001
Developed an approach to account for the impact of item presentation on ensuing constructs in the development of two versions of a self-report measure, the Relational Concept Scale, that was tested with 978 adolescent students in the United Kingdom. Outlines benefits of developing two versions of the scale to protect against presentational bias.…
Descriptors: Adolescents, Foreign Countries, Statistical Bias, Test Construction
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
The results of 96 theoretical and empirical studies were reviewed to determine whether they support a taxonomy of 43 rules for writing multiple-choice test items. The taxonomy is the result of an analysis of 46 textbooks dealing with multiple-choice item writing. For nearly half of the rules, no research was found. (SLD)
Descriptors: Classification, Literature Reviews, Multiple Choice Tests, Test Construction
Peer reviewed
Hagtvet, Knut A.; Nasser, Fadia M. – Structural Equation Modeling, 2004
This article presents a methodology for examining the content and nature of item parcels as indicators of a conceptually defined latent construct. An essential component of this methodology is the two-facet measurement model, which includes items and parcels as facets of construct indicators. The two-facet model tests assumptions required for…
Descriptors: Evaluation Methods, Validity, Test Anxiety, Content Validity
Haladyna, Thomas M. – 1999
This book explains how to write effective multiple-choice test items and how to study responses to items in order to evaluate and improve them, two topics that are very important in the development of many cognitive tests. The chapters are: (1) "Providing a Context for Multiple-Choice Testing"; (2) "Constructed-Response and Multiple-Choice Item Formats"; (3)…
Descriptors: Constructed Response, Multiple Choice Tests, Test Construction, Test Format
Peer reviewed
Crehan, Kevin; Haladyna, Thomas M. – Journal of Experimental Education, 1991
Two item-writing rules were tested: phrasing stems as questions rather than partial sentences, and using the "none-of-the-above" option instead of a specific content option. Results with 228 college students do not support the use of either stem type and provide limited evidence to caution against the "none-of-the-above" option.…
Descriptors: College Students, Higher Education, Multiple Choice Tests, Test Construction
Hambleton, Ronald K.; Patsula, Liane – 2000
Whatever the purpose of test adaptation, questions arise concerning the validity of inferences from such adapted tests. This paper considers several advantages and disadvantages of adapting tests from one language and culture to another. The paper also reviews several sources of error or invalidity associated with adapting tests and suggests ways…
Descriptors: Cross Cultural Studies, Cultural Awareness, Quality of Life, Test Construction