ERIC Number: EJ1464701
Record Type: Journal
Publication Date: 2025-Apr
Pages: 22
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-0926-7220
EISSN: EISSN-1573-1901
Available Date: 2024-01-29
Can Generative AI and ChatGPT Outperform Humans on Cognitive-Demanding Problem-Solving Tasks in Science?
Xiaoming Zhai1,2; Matthew Nyaaba1,3; Wenchao Ma4
Science & Education, v34 n2 p649-670 2025
This study examined the assumption that generative artificial intelligence (GAI) tools can overcome the cognitive intensity that humans experience when solving problems. We evaluated the performance of ChatGPT and GPT-4 on NAEP science assessments and compared it with student performance across levels of item cognitive demand. Fifty-four tasks from the 2019 NAEP science assessment were coded by content experts using a two-dimensional cognitive load framework comprising task cognitive complexity and dimensionality. ChatGPT and GPT-4 answered the questions independently and were scored with the scoring keys provided by NAEP. The analysis drew on the average ability scores of students who answered each item correctly and the percentage of students who responded to each item. The results showed that both ChatGPT and GPT-4 consistently outperformed most students on each item in the NAEP science assessments. As the cognitive demand of the items increased, statistically significantly higher average student ability scores were required to answer them correctly; this pattern held for students in Grades 4, 8, and 12. In contrast, ChatGPT and GPT-4 were not statistically sensitive to increases in task cognitive demand, except at Grade 4. As the first study to compare cutting-edge GAI with K-12 students on problem solving in science, this finding implies that educational objectives should be revised to prepare students to work competently with GAI tools such as ChatGPT and GPT-4. Education ought to emphasize the cultivation of advanced cognitive skills rather than relying solely on tasks that demand cognitive intensity, thereby fostering students' critical thinking, analytical skills, and application of knowledge in novel contexts. Furthermore, the findings suggest that researchers should innovate assessment practices by shifting from cognitively intensive tasks toward tasks that require creativity and analytical skills, in order to better mitigate the negative effects of GAI on testing.
Descriptors: Artificial Intelligence, National Competency Tests, Elementary Secondary Education, Problem Solving, Science Achievement, Cognitive Processes, Difficulty Level, Comparative Analysis, Influence of Technology
Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/
Publication Type: Journal Articles; Reports - Research
Education Level: Elementary Secondary Education
Audience: N/A
Language: English
Sponsor: National Science Foundation (NSF)
Authoring Institution: N/A
Identifiers - Assessments and Surveys: National Assessment of Educational Progress
Grant or Contract Numbers: 2101104; 2138854
Author Affiliations: 1University of Georgia, AI4STEM Education Center, Athens, USA; 2University of Georgia, Department of Mathematics, Science, and Social Studies Education, Athens, USA; 3University of Georgia, Department of Educational Theory and Practice, Athens, USA; 4University of Alabama, College of Education, Tuscaloosa, USA