ERIC Number: EJ1478271
Record Type: Journal
Publication Date: 2025-Jul
Pages: 12
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-1470-8175
EISSN: EISSN-1539-3429
Available Date: 2025-05-05
Using ChatGPT as a Tool for Training Nonprogrammers to Generate Genomic Sequence Analysis Code
Haley A. Delcher1; Enas S. Alsatari1; Adeyeye I. Haastrup1; Sayema Naaz1; Lydia A. Hayes-Guastella2; Autumn M. McDaniel3; Olivia G. Clark4; Devin M. Katerski4; Francois O. Prinsloo4; Olivia R. Roberts4; Meredith A. Shaddix3; Bridgette N. Sullivan4; Isabella M. Swan4; Emily M. Hartsell1; Jeffrey D. DeMeis1; Sunita S. Paudel1; Glen M. Borchert1,4
Biochemistry and Molecular Biology Education, v53 n4 p433-444 2025
Today, due to the size of many genomes and the increasingly large sizes of sequencing files, independently analyzing sequencing data is largely impossible for a biologist with little to no programming expertise. As such, biologists typically face the dilemma of either spending a significant amount of time and effort learning to program themselves or identifying (and relying on) an available computer scientist to analyze large sequence data sets. That said, the advent of AI-powered programs like ChatGPT may offer a means of bridging the disconnect between biologists and the analysis of genomic data that is critically important to their field. The work detailed herein demonstrates how integrating ChatGPT into an existing Course-based Undergraduate Research Experience curriculum can give biology students with no programming expertise the power to generate their own programs and allow those students to carry out a publishable, comprehensive analysis of real-world Next Generation Sequencing (NGS) datasets. Relying solely on the students' biology background as a prompt for directing ChatGPT to generate Python code, we found students could readily generate programs able to handle and analyze NGS datasets greater than 10 gigabytes. In summary, we believe that integrating ChatGPT into education can help bridge a critical gap between biology and computer science and may prove similarly beneficial in other disciplines. Additionally, ChatGPT can provide biological researchers with powerful new tools capable of mediating NGS dataset analysis to help accelerate major new advances in the field.
Descriptors: Artificial Intelligence, Technology Uses in Education, Training, Teaching Methods, Programming, Genetics, Coding, Undergraduate Students, Biology, Prompting, Technology Integration, Computer Science
Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Publication Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: National Science Foundation (NSF), Division of Integrative Organismal Systems (IOS); National Science Foundation (NSF), Division of Molecular and Cellular Biosciences (MCB)
Authoring Institution: N/A
Grant or Contract Numbers: 2243532; 2219900
Data File: URL: https://github.com/glen-borchert/ChatGPTgeneratedPythonCodes-ForBioinformaticAnalysis
Author Affiliations: 1Department of Pharmacology, University of South Alabama, Mobile, Alabama, USA; 2Stokes School of Marine and Environmental Sciences, University of South Alabama, Mobile, Alabama, USA; 3Department of Biomedical Sciences, University of South Alabama, Mobile, Alabama, USA; 4Department of Biology, University of South Alabama, Mobile, Alabama, USA

Peer reviewed