Si Huang, Rui Wang, Qing Xie, Lin Li, Yongjian Liu
{"title":"一种长文档摘要的抽取-抽象混合方法","authors":"Si Huang, Rui Wang, Qing Xie, Lin Li, Yongjian Liu","doi":"10.1109/BESC48373.2019.8962979","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a hybrid model of extractive and abstractive methods to tackle the long document automatic summarization task. The model first trains an extractor to extract salient sentences from the original text. Next, these salient sentences are put together to get a condensed version of the original text. Then we use the abstractive model to rewrite the extracted sentences to get the final summary. In order to avoid the exposure bias, reinforcement training is used to optimize the proposed model. Experiments in NLPCC2017 Shared Task 3 show that our models achieve competitive performance. Additionally, the ROUGE score of our model exceeds the score of the state-of-the-art model in the original NLPCC2017 Shared Task 3, where a sentence summary is generated from each Chinese news article.","PeriodicalId":190867,"journal":{"name":"2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"An Extraction-Abstraction Hybrid Approach for Long Document Summarization\",\"authors\":\"Si Huang, Rui Wang, Qing Xie, Lin Li, Yongjian Liu\",\"doi\":\"10.1109/BESC48373.2019.8962979\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose a hybrid model of extractive and abstractive methods to tackle the long document automatic summarization task. The model first trains an extractor to extract salient sentences from the original text. Next, these salient sentences are put together to get a condensed version of the original text. Then we use the abstractive model to rewrite the extracted sentences to get the final summary. 
In order to avoid the exposure bias, reinforcement training is used to optimize the proposed model. Experiments in NLPCC2017 Shared Task 3 show that our models achieve competitive performance. Additionally, the ROUGE score of our model exceeds the score of the state-of-the-art model in the original NLPCC2017 Shared Task 3, where a sentence summary is generated from each Chinese news article.\",\"PeriodicalId\":190867,\"journal\":{\"name\":\"2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC)\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/BESC48373.2019.8962979\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BESC48373.2019.8962979","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An Extraction-Abstraction Hybrid Approach for Long Document Summarization
In this paper, we propose a hybrid of extractive and abstractive methods to tackle the task of automatic long-document summarization. The model first trains an extractor to select salient sentences from the original text; these sentences are then concatenated to form a condensed version of the document. An abstractive model subsequently rewrites the extracted sentences into the final summary. To mitigate exposure bias, the proposed model is optimized with reinforcement learning. Experiments on NLPCC2017 Shared Task 3 show that our models achieve competitive performance. Moreover, our model's ROUGE score exceeds that of the state-of-the-art system from the original NLPCC2017 Shared Task 3, in which a one-sentence summary is generated for each Chinese news article.
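The two-stage pipeline in the abstract (extract salient sentences, then rewrite them) can be sketched as follows. This is only an illustrative data-flow sketch, not the paper's method: the real system uses trained neural networks with reinforcement learning, whereas here the "extractor" is a simple word-frequency sentence scorer and the "rewriter" is a stub that compresses each sentence. All function names are our own invention.

```python
# Sketch of an extract-then-rewrite summarization pipeline.
# Assumption: a frequency heuristic stands in for the trained extractor,
# and truncation stands in for the abstractive rewriter.
from collections import Counter


def extract_salient(sentences, k=2):
    """Score sentences by the document-level frequency of their words and
    return the top-k in original order (stand-in for a trained extractor)."""
    freqs = Counter(w.lower() for s in sentences for w in s.split())

    def score(i):
        words = sentences[i].split()
        return sum(freqs[w.lower()] for w in words) / max(len(words), 1)

    top = sorted(range(len(sentences)), key=score, reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # preserve document order


def rewrite(sentence, max_words=8):
    """Placeholder for the abstractive rewriter: keep the first few words."""
    return " ".join(sentence.split()[:max_words])


def summarize(sentences, k=2):
    condensed = extract_salient(sentences, k)  # stage 1: extract
    return [rewrite(s) for s in condensed]     # stage 2: rewrite


doc = [
    "The model first trains an extractor to select salient sentences.",
    "The weather was nice yesterday.",
    "The extracted sentences form a condensed version of the document.",
    "An abstractive model then rewrites the condensed version into the summary.",
]
print(summarize(doc, k=2))
```

In the paper, both stages are learned and the rewriter is trained with a reinforcement objective so that decoding-time errors are reflected in training, which is what mitigates exposure bias; the sketch only mirrors the overall extract-then-abstract control flow.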