Showing all 13 results
Peer reviewed
Georgios Zacharis; Stamatios Papadakis – Educational Process: International Journal, 2025
Background/purpose: Generative artificial intelligence (GenAI) is often promoted as a transformative tool for assessment, yet evidence of its validity compared to human raters remains limited. This study examined whether an AI-based rater could be used interchangeably with trained faculty in scoring complex coursework. Materials/methods:…
Descriptors: Artificial Intelligence, Technology Uses in Education, Computer Assisted Testing, Grading
Peer reviewed
You, Hye Sun; Park, Sunyoung; Marshall, Jill A.; Delgado, Cesar – Research in Science Education, 2022
Growing interest in interdisciplinary (ID) understanding has led to the recent development of four ID assessments, none of which have previously been comprehensively validated. Sources of evidence for the validity of tests include construct validity, such as the internal structure of the test. ID tests may (and should) test both disciplinary (D)…
Descriptors: High School Students, College Students, Interdisciplinary Approach, Test Construction
Peer reviewed
Dahlke, Katie; Yang, Rui; Martínez, Carmen; Chavez, Suzette; Martin, Alejandra; Hawkinson, Laura; Shields, Joseph; Garland, Marshall; Carle, Jill – Regional Educational Laboratory Southwest, 2017
The New Mexico Public Education Department developed the Kindergarten Observation Tool (KOT) as a multidimensional observational measure of students' knowledge and skills at kindergarten entry. The primary purpose of the KOT is to inform instruction, so that kindergarten teachers can use the information about their students' knowledge and skills…
Descriptors: Test Validity, Observation, Measures (Individuals), Kindergarten
Peer reviewed
Reddy, Linda A.; Dudek, Christopher M.; Fabiano, Gregory A.; Peters, Stephanie – School Psychology Quarterly, 2015
This article presents information about the construct validity and reliability of a new teacher self-report measure of classroom instructional and behavioral practices (the Classroom Strategies Scales-Teacher Form; CSS-T). The theoretical underpinnings and empirical basis for the instructional and behavioral management scales are presented.…
Descriptors: Measurement Techniques, Construct Validity, Test Validity, Test Reliability
Peer reviewed
Cater, Melissa; Ferstel, Sarah D.; O'Neil, Carol E. – Journal of General Education, 2016
Student participation in undergraduate research (UGR) may be influenced by interest in research, future career and educational plans, perceived value of undergraduate research experiences, or perceived competence in research skills. The purpose of this study was to develop a questionnaire that could be used to validly and reliably assess students'…
Descriptors: Undergraduate Students, Student Experience, Questionnaires, Test Construction
Peer reviewed
Reddy, Linda A.; Fabiano, Gregory; Dudek, Christopher M.; Hsu, Louis – School Psychology Quarterly, 2013
Research on progress monitoring has almost exclusively focused on student behavior and not on teacher practices. This article presents the development and validation of a new teacher observational assessment (Classroom Strategies Scale) of classroom instructional and behavioral management practices. The theoretical underpinnings and empirical…
Descriptors: Test Construction, Construct Validity, Test Validity, Observation
Peer reviewed
Hassan, Nurul Huda; Shih, Chih-Min – Language Assessment Quarterly, 2013
This article describes and reviews the Singapore-Cambridge General Certificate of Education Advanced Level General Paper (GP) examination. As a written test that is administered to preuniversity students, the GP examination is internationally recognised and accepted by universities and employers as proof of English competence. In this article, the…
Descriptors: Foreign Countries, College Entrance Examinations, English (Second Language), Writing Tests
Peer reviewed
Wagler, Amy; Wagler, Ron – International Journal of Science Education, 2013
The Measure of Acceptance of the Theory of Evolution (MATE) was constructed to be a single-factor instrument that assesses an individual's overall acceptance of evolutionary theory. The MATE was validated and the scores resulting from the MATE were found to be reliable for the population of inservice high school biology teachers. However, many…
Descriptors: Evolution, Theories, Measures (Individuals), Preservice Teachers
Peer reviewed
Mislevy, Robert J.; Haertel, Geneva; Cheng, Britte H.; Ructtinger, Liliana; DeBarger, Angela; Murray, Elizabeth; Rose, David; Gravel, Jenna; Colker, Alexis M.; Rutstein, Daisy; Vendlinski, Terry – Educational Research and Evaluation, 2013
Standardizing aspects of assessments has long been recognized as a tactic to help make evaluations of examinees fair. It reduces variation in irrelevant aspects of testing procedures that could advantage some examinees and disadvantage others. However, recent attention to making assessment accessible to a more diverse population of students…
Descriptors: Testing Accommodations, Access to Education, Testing, Psychometrics
Peoples, Shelagh – ProQuest LLC, 2012
The purpose of this study was to determine which of three competing models will provide reliable, interpretable, and responsive measures of elementary students' understanding of the nature of science (NOS). The Nature of Science Instrument-Elementary (NOSI-E), a 28-item Rasch-based instrument, was used to assess students' NOS…
Descriptors: Scientific Principles, Science Tests, Elementary School Students, Item Response Theory
Yoon, So Yoon – ProQuest LLC, 2011
Working under classical test theory (CTT) and item response theory (IRT) frameworks, this study investigated psychometric properties of the Revised Purdue Spatial Visualization Tests: Visualization of Rotations (Revised PSVT:R). The original version, the PSVT:R, was designed by Guay (1976) to measure spatial visualization ability in…
Descriptors: Undergraduate Students, Test Bias, Guessing (Tests), Construct Validity
Peer reviewed
Zhang, Ying; Elder, Catherine – Language Assessment Quarterly, 2009
The College English Test-Spoken English Test is a nationwide spoken English test designed to assess the oral communicative ability of Chinese university and college students who have undertaken compulsory English study at a Chinese university. This article describes the test and evaluates it in terms of reliability, validity, authenticity,…
Descriptors: Test Results, Language Tests, Rating Scales, Foreign Countries
Nenty, H. Johnson – 1986
The Cattell Culture Fair Intelligence Test (CCFIT) was administered to a large sample of American, Nigerian, and Indian adolescents, and item data were examined for cultural bias. The CCFIT was designed to measure fluid intelligence, which is not influenced by cultural differences. Four different item analysis techniques were used to determine…
Descriptors: Construct Validity, Cross Cultural Studies, Cultural Influences, Culture Fair Tests