Peer reviewed
ERIC Number: ED620158
Record Type: Non-Journal
Publication Date: 2021-Nov
Pages: 10
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: N/A
Mitigating Data Scarceness through Data Synthesis, Augmentation and Curriculum for Abstractive Summarization
Magooda, Ahmed; Litman, Diane
Grantee Submission, Paper presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021) (Nov 7-11, 2021)
This paper explores three simple data manipulation techniques (synthesis, augmentation, curriculum) for improving abstractive summarization models without the need for any additional data. We introduce a method of data synthesis with paraphrasing, a data augmentation technique with sample mixing, and curriculum learning with two new difficulty metrics based on specificity and abstractiveness. We conduct experiments to show that these three techniques can help improve abstractive summarization across two summarization models and two different small datasets. Furthermore, we show that these techniques can improve performance when applied in isolation and when combined. [This paper was published in: "Findings of the Association for Computational Linguistics: EMNLP 2021," Association for Computational Linguistics, 2021, pp. 2043-2052.]
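Illustrative note: the abstract names abstractiveness as one curriculum difficulty metric but does not define it. The sketch below is therefore an assumption and not the authors' method. It treats abstractiveness as the fraction of summary tokens that do not appear in the source text, and orders training pairs from easiest (most extractive) to hardest (most abstractive). The function names `abstractiveness` and `curriculum_order` and the sample pairs are hypothetical.

```python
from typing import List, Tuple


def abstractiveness(source: str, summary: str) -> float:
    """Assumed metric: share of summary tokens that are absent from the source."""
    source_tokens = set(source.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0  # an empty summary introduces no novel tokens
    novel = sum(1 for tok in summary_tokens if tok not in source_tokens)
    return novel / len(summary_tokens)


def curriculum_order(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (source, summary) pairs by ascending difficulty (easy first)."""
    return sorted(pairs, key=lambda p: abstractiveness(p[0], p[1]))


# Hypothetical toy data, not taken from the paper's datasets.
pairs = [
    ("the lecture covered sorting algorithms", "students found recursion confusing"),
    ("the lecture covered sorting algorithms", "the lecture covered sorting"),
    ("the lecture covered sorting algorithms", "the lecture explained sorting"),
]
ordered = curriculum_order(pairs)
for source, summary in ordered:
    print(f"{abstractiveness(source, summary):.2f}  {summary}")
```

A curriculum built this way would present the ordered pairs to the model in sequence, or in progressively harder batches. How the paper actually schedules its samples is not stated in this record.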
Publication Type: Speeches/Meeting Papers; Reports - Descriptive
Education Level: N/A
Audience: N/A
Language: English
Sponsor: Institute of Education Sciences (ED)
Authoring Institution: N/A
IES Funded: Yes
Grant or Contract Numbers: R305A180477
Author Affiliations: N/A