ERIC Number: ED662300
Record Type: Non-Journal
Publication Date: 2024
Pages: 239
Abstractor: As Provided
ISBN: 979-8-3840-3952-5
ISSN: N/A
EISSN: N/A
Available Date: N/A
Japanese Particles in Learner Language: Corpus Design, Automatic Annotation, and Analysis
Misato Hiraga
ProQuest LLC, Ph.D. Dissertation, Indiana University
This dissertation developed a new learner corpus of Japanese and introduced an error and linguistic annotation scheme specifically designed for Japanese particles. The corpus contains texts written by learners enrolled in first- through fourth-year university-level Japanese courses. The texts in the corpus were tagged with part-of-speech and syntactic annotations, and each particle was annotated for its correctness, for the correct particle when the learner's choice was incorrect, and for its correct function in the sentence. The consistency of this annotation approach was evaluated through inter-annotator agreement. Additionally, this study developed an automated annotation system using machine learning. The system is designed to predict not only the correctness of particles but also their correct function. The system performed well on the majority classes, but the minority classes posed a challenge: dominant classes represented over 90% of the data, biasing the models toward them and degrading performance on minority classes. To address this issue, three strategies were employed: oversampling, adding synthetic data, and combining both methods. Oversampling performed best, achieving the highest F1 scores in five out of eight classification tasks. To demonstrate the utility of the corpus with error and linguistic annotations, two case studies and pedagogical suggestions were presented. The first case study demonstrates that analyzing all instances of a single particle can be misleading, as confusion often arises from only one of its many functions. The second case study shows that two different uses of one particle lead to two different confusion patterns with other particles. The two cases were addressed differently because different functions of a particle involve different types of errors. Pinpointing these errors yields a deeper understanding of learners' challenges and insights into effective teaching methods.
[The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml.]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: N/A