ERIC Number: ED649769
Record Type: Non-Journal
Publication Date: 2022
Pages: 297
Abstractor: As Provided
ISBN: 979-8-3575-5235-8
ISSN: N/A
EISSN: N/A
Available Date: N/A
Artificial Neural Networks as Models of Human Language Acquisition
Alex Warstadt
ProQuest LLC, Ph.D. Dissertation, New York University
Data-driven learning uncontroversially plays a role in human language acquisition; how large a role is a matter of much debate. The success of artificial neural networks in NLP in recent years calls for a re-evaluation of our understanding of the possibilities for learning grammar from data alone. This dissertation makes the case for using artificial neural networks to test hypotheses about human language acquisition and presents progress towards this goal from multiple directions. Compared to experiments on human subjects, experiments on artificial learners based on neural networks have massive advantages in terms of ethics, expense, and expanded possibilities for experimental design. I provide a general recipe for building more convincing model learners and designing ablation experiments using them (Chapter 1). Subsequently, I introduce benchmarks, including the Corpus of Linguistic Acceptability (CoLA; Chapter 2) and the Benchmark of Linguistic Minimal Pairs for English (BLiMP; Chapter 3), that use acceptability judgments to probe grammatical knowledge in artificial neural networks. Although off-the-shelf neural language models popular in natural language processing today achieve human-level performance on these benchmarks, they are trained on orders of magnitude more linguistic input than children are exposed to, making them unsuitable for studying human language acquisition. Thus, I train language models from scratch in more human-like settings using limited data, and track the acquisition of language-specific inductive bias and grammatical features as a function of the volume of input (Chapters 4 and 5). Results show that there is a remarkably rich signal for grammar learning in raw text data, but current models require considerably more data than a child to learn from it. Chapter 6 is a synthesis of these approaches applied to a long-standing debate in language acquisition regarding the learnability of structure dependence in subject-auxiliary inversion. Through a controlled experimental manipulation of the learning environment of neural language models, I find evidence that, in the absence of an innate hierarchical bias, direct evidence against a linear rule, while helpful, is not necessary for data-driven learners to arrive at acceptability judgments consistent with a structural generalization. These results not only highlight the importance of considering indirect evidence in learnability debates, but also provide a proof of concept for the use of artificial learners in evaluating previously untestable hypotheses about human language acquisition. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
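As an illustration of the minimal-pair acceptability paradigm the abstract describes (BLiMP-style evaluation), the sketch below compares the total log-probabilities a pretrained causal language model assigns to the two sentences of a pair and counts the model as "correct" when it prefers the acceptable sentence. This is not taken from the dissertation itself; the choice of GPT-2 via the Hugging Face transformers library and the example sentence pair are assumptions for demonstration only.

# Illustrative sketch of minimal-pair evaluation: a causal language model
# "prefers" the acceptable sentence if it assigns it a higher total
# log-probability. Model choice (gpt2) and the example pair are assumptions,
# not drawn from this record.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability the model assigns to the sentence's tokens."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=ids, the returned loss is the mean negative
        # log-likelihood over the predicted (shifted) tokens.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

# Hypothetical minimal pair (subject-verb agreement).
acceptable = "The keys to the cabinet are on the table."
unacceptable = "The keys to the cabinet is on the table."

correct = sentence_logprob(acceptable) > sentence_logprob(unacceptable)
print("Model prefers the acceptable sentence:", correct)

Aggregating this comparison over many such pairs, grouped by grammatical phenomenon, yields the kind of accuracy scores the benchmarks described above report.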
Descriptors: Language Acquisition, Artificial Intelligence, Computational Linguistics, Ethics, Models, English, Benchmarking, Grammar, Decision Making, Brain Hemisphere Functions, Natural Language Processing, Linguistic Input, Generalization, Learning Processes, Linguistic Theory
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: N/A