ERIC Number: ED669190
Record Type: Non-Journal
Publication Date: 2021
Pages: 139
Abstractor: As Provided
ISBN: 979-8-5355-9671-6
ISSN: N/A
EISSN: N/A
Available Date: 0000-00-00
Context Matters: Employing Word Embeddings to Improve Text Classifier Performance on Peer-Reviewed Academic Journal Abstracts--A Test Case
Arielle A. Gaither
ProQuest LLC, D.Engr. Dissertation, The George Washington University
Bag-of-words is a commonly used text representation method for many text classification applications. However, the bag-of-words representation fails to consider the context of the text because it only examines text documents based on the presence of individual words and explores relationships between texts with similar word choices (Bengfort, 2018). Understanding the context of a text is important to classify closely related texts correctly. In recent years, research has emerged to classify large corpora using a subset of the text, specifically abstracts and metadata. However, this research almost exclusively focuses on medical and biomedical data sets derived from MEDLINE, including the 2014 BioASQ Challenge data set for biomedical semantic indexing. This research aimed to show the benefit, in terms of increased text classification performance, of employing semantic analysis in data preprocessing to classify peer-reviewed journal abstracts by subject. The results of this research showed that semantic analysis preprocessing did not significantly improve classification performance for the research data set. However, text classification is a viable option to automate some requirements elicitation activities and reduce the amount of manual intervention required to review and distribute requests for new Department of Defense Information Technology projects. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml.]
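Illustrative note: the limitation the abstract attributes to bag-of-words can be seen in a minimal sketch. This is not code from the dissertation; the function name `bag_of_words` and the two sample sentences are hypothetical, and only the Python standard library is assumed.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Return word counts for a text, ignoring case and word order.

    Hypothetical helper for illustration only.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


# Two sentences with opposite meanings but identical word choices.
a = "the student reviews the editor"
b = "the editor reviews the student"

# Word order is discarded, so both texts get the same representation.
same_representation = bag_of_words(a) == bag_of_words(b)
print(same_representation)
```

Because the two sentences map to identical counts, a classifier that sees only these counts cannot tell them apart, which is the kind of lost context that word embeddings are meant to recover.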
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: N/A