Extractive Document Summarization Based on Hierarchical GRU

Yong Zhang, Jinzhi Liao, Jiuyang Tang, W. Xiao, Yuheng Wang

2018 International Conference on Robots & Intelligent System (ICRIS), May 2018. DOI: 10.1109/ICRIS.2018.00092
Neural networks have provided an efficient approach to extractive document summarization, i.e., selecting sentences from the text to form the summary. However, conventional methods have two shortcomings: they extract the summary directly from the whole document, which contains substantial redundancy, and they neglect the relations between the abstract and the document. This paper proposes TSERNN, a two-stage structure in which a key-sentence extraction stage is followed by a recurrent-neural-network-based model that performs extractive summarization of the document. In the extraction phase, it devises a hybrid sentence similarity measure that combines sentence vectors with Levenshtein distance and integrates it into a graph model to extract key sentences. In the second phase, it uses GRUs as the basic building blocks and feeds an LDA-based representation of the entire document as a feature to support summarization. Finally, the model is tested on the CNN/Daily Mail corpus, and the experimental results verify the accuracy and validity of the proposed method.
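The abstract does not give the exact form of the hybrid similarity or the graph ranker, so the following is only a minimal sketch of one plausible reading of the first phase: cosine similarity between sentence vectors blended with a normalized Levenshtein similarity, fed into a TextRank-style PageRank over a fully connected sentence graph. The mixing weight ALPHA, the normalization of the edit distance, and the use of networkx PageRank are assumptions, not the paper's specification.

```python
# Hypothetical sketch of the phase-one key-sentence extraction: a hybrid
# similarity (sentence vectors + Levenshtein distance) plugged into a
# graph-based ranker. ALPHA and the PageRank step are assumptions.
import numpy as np
import networkx as nx

ALPHA = 0.7  # assumed weight between vector similarity and edit-distance similarity


def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming Levenshtein (edit) distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def hybrid_similarity(vec_i, vec_j, sent_i: str, sent_j: str) -> float:
    """Combine cosine similarity of sentence vectors with a normalized
    Levenshtein similarity (1 - distance / max length)."""
    cos = float(np.dot(vec_i, vec_j) /
                (np.linalg.norm(vec_i) * np.linalg.norm(vec_j) + 1e-8))
    lev = 1.0 - levenshtein(sent_i, sent_j) / max(len(sent_i), len(sent_j), 1)
    return ALPHA * cos + (1.0 - ALPHA) * lev


def extract_key_sentences(sentences, vectors, top_k=10):
    """Build a fully connected sentence graph weighted by the hybrid similarity
    and rank sentences with PageRank (a TextRank-style ranker)."""
    g = nx.Graph()
    g.add_nodes_from(range(len(sentences)))
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            w = hybrid_similarity(vectors[i], vectors[j], sentences[i], sentences[j])
            g.add_edge(i, j, weight=w)
    scores = nx.pagerank(g, weight="weight")
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]  # keep original document order
```

Here `sentences` is the list of sentence strings and `vectors` their precomputed sentence embeddings; how those embeddings are obtained is left open, since the abstract only says sentence vectors are used.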
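For the second phase, the abstract only states that GRUs are the basic blocks and that an LDA-based representation of the whole document is added as a feature. The sketch below assumes a common hierarchical layout (word-level GRU producing sentence vectors, sentence-level GRU producing document-contextualized sentence states, and a per-sentence extraction score), with the document's LDA topic vector concatenated to each sentence state before scoring. PyTorch, the layer sizes, and the sigmoid scorer are assumptions.

```python
# Hypothetical PyTorch sketch of a hierarchical GRU extractor in the spirit of
# the abstract. All layer sizes, the concatenation scheme, and the sigmoid
# scorer are assumptions, not the paper's specification.
import torch
import torch.nn as nn


class HierarchicalGRUExtractor(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, word_hid=128, sent_hid=128, lda_topics=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.word_gru = nn.GRU(emb_dim, word_hid, batch_first=True, bidirectional=True)
        self.sent_gru = nn.GRU(2 * word_hid, sent_hid, batch_first=True, bidirectional=True)
        # score each sentence from [sentence state ; document LDA topic vector]
        self.scorer = nn.Linear(2 * sent_hid + lda_topics, 1)

    def forward(self, doc_tokens, lda_vec):
        # doc_tokens: (num_sents, max_words) word ids for one document
        # lda_vec:    (lda_topics,) LDA topic distribution of the whole document
        emb = self.embed(doc_tokens)                               # (S, W, E)
        _, h_word = self.word_gru(emb)                             # (2, S, word_hid)
        sent_vecs = torch.cat([h_word[0], h_word[1]], dim=-1)      # (S, 2*word_hid)
        sent_states, _ = self.sent_gru(sent_vecs.unsqueeze(0))     # (1, S, 2*sent_hid)
        sent_states = sent_states.squeeze(0)                       # (S, 2*sent_hid)
        doc_feat = lda_vec.unsqueeze(0).expand(sent_states.size(0), -1)
        logits = self.scorer(torch.cat([sent_states, doc_feat], dim=-1)).squeeze(-1)
        return torch.sigmoid(logits)  # per-sentence probability of being extracted
```

In this reading, the key sentences from phase one would form the input document to this extractor, and the per-sentence probabilities would be thresholded or top-ranked to assemble the final summary; the abstract does not spell out that selection step.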