Abstractive Text Summarizer: A Comparative Study on Dot Product Attention and Cosine Similarity

K. S, Naveen L, A. Raj, R. S, A. S
{"title":"抽象文本摘要器:点积注意力与余弦相似度的比较研究","authors":"K. S, Naveen L, A. Raj, R. S, A. S","doi":"10.1109/icecct52121.2021.9616710","DOIUrl":null,"url":null,"abstract":"Text summarization is the process of extracting a subset of the document in such a way that the idea conveyed by the passage is understood while omitting peripheral details which do not have any impact on the passage. The aim of this work is to design an abstractive text summarizer using natural language processing that takes as input a newspaper article and provide a summary on that article in about 100 words. The model is designed using a Sequence to Sequence architecture coupled with an attention mechanism so that the model learns to pay attention to important words rather than trying to remember all of them. The model is trained using a dataset containing newspaper articles and their summaries provided by Kaggle. Pre-trained models such as BERT and T5 are also used to generate summaries and evaluate the performance of the proposed model against the pre-trained models. The three models such as Seq-Seq, BERT and T5 are evaluated on four datasets such as BBC-News-Dataset, Amazon food reviews, News-summary and NewsRoom datasets. Their rouge scores are analysed to select the ideal algorithm for summarization. The attention mechanism is customised to use cosine similarity instead of dot product. Cosine similarity is found to work better in the case of short summaries while dot product is found to work better for long summaries.","PeriodicalId":155129,"journal":{"name":"2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)","volume":"487 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Abstractive Text Summarizer: A Comparative Study on Dot Product Attention and Cosine Similarity\",\"authors\":\"K. S, Naveen L, A. Raj, R. S, A. S\",\"doi\":\"10.1109/icecct52121.2021.9616710\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Text summarization is the process of extracting a subset of the document in such a way that the idea conveyed by the passage is understood while omitting peripheral details which do not have any impact on the passage. The aim of this work is to design an abstractive text summarizer using natural language processing that takes as input a newspaper article and provide a summary on that article in about 100 words. The model is designed using a Sequence to Sequence architecture coupled with an attention mechanism so that the model learns to pay attention to important words rather than trying to remember all of them. The model is trained using a dataset containing newspaper articles and their summaries provided by Kaggle. Pre-trained models such as BERT and T5 are also used to generate summaries and evaluate the performance of the proposed model against the pre-trained models. The three models such as Seq-Seq, BERT and T5 are evaluated on four datasets such as BBC-News-Dataset, Amazon food reviews, News-summary and NewsRoom datasets. Their rouge scores are analysed to select the ideal algorithm for summarization. The attention mechanism is customised to use cosine similarity instead of dot product. 
Cosine similarity is found to work better in the case of short summaries while dot product is found to work better for long summaries.\",\"PeriodicalId\":155129,\"journal\":{\"name\":\"2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)\",\"volume\":\"487 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/icecct52121.2021.9616710\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icecct52121.2021.9616710","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

Text summarization is the process of extracting a subset of a document such that the idea conveyed by the passage is preserved while peripheral details that have no impact on it are omitted. The aim of this work is to design an abstractive text summarizer, using natural language processing, that takes a newspaper article as input and produces a summary of that article in about 100 words. The model uses a sequence-to-sequence (Seq2Seq) architecture coupled with an attention mechanism, so that it learns to pay attention to important words rather than trying to remember all of them. The model is trained on a Kaggle dataset of newspaper articles and their summaries. Pre-trained models such as BERT and T5 are also used to generate summaries, and the proposed model's performance is evaluated against these pre-trained models. The three models, Seq2Seq, BERT, and T5, are evaluated on four datasets: the BBC News, Amazon Food Reviews, News Summary, and NewsRoom datasets. Their ROUGE scores are analysed to select the most suitable algorithm for summarization. The attention mechanism is customised to use cosine similarity instead of the dot product. Cosine similarity is found to work better for short summaries, while the dot product works better for long summaries.
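
The core modification studied in the paper is the scoring function inside the attention mechanism: the usual dot product between the decoder state and each encoder hidden state is replaced by cosine similarity, i.e. the dot product of the L2-normalised vectors. The sketch below contrasts the two scoring variants for a single decoder step. It is a minimal NumPy illustration with illustrative names, not the authors' implementation.

```python
import numpy as np

def attention_step(decoder_state, encoder_states, score="dot"):
    """Compute attention weights and the context vector for one decoder step.

    decoder_state  : (d,)   current decoder hidden state
    encoder_states : (T, d) encoder hidden states for the T source tokens
    score          : "dot" for dot-product scores, "cosine" for cosine similarity
    """
    if score == "dot":
        # Dot-product score: s_i = h_enc_i . h_dec (unnormalised, magnitude-sensitive)
        scores = encoder_states @ decoder_state
    elif score == "cosine":
        # Cosine similarity: normalise both vectors first, so every score lies in [-1, 1]
        enc_norm = encoder_states / (np.linalg.norm(encoder_states, axis=1, keepdims=True) + 1e-9)
        dec_norm = decoder_state / (np.linalg.norm(decoder_state) + 1e-9)
        scores = enc_norm @ dec_norm
    else:
        raise ValueError(f"unknown score type: {score}")

    # Softmax over source positions gives the attention distribution,
    # and the weighted sum of encoder states gives the context vector.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    context = weights @ encoder_states
    return weights, context
```

Because cosine scores are bounded to [-1, 1] while raw dot products grow with vector magnitude, the two variants yield differently peaked attention distributions, which is one plausible reason they behave differently on short versus long summaries.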
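The paper also generates summaries with pre-trained BERT and T5 baselines. For T5, a comparable off-the-shelf baseline can be obtained with the Hugging Face transformers summarization pipeline; the t5-small checkpoint and generation settings below are assumptions made for illustration, not the configuration reported in the paper, and the BERT-based variant is omitted because the paper does not describe how BERT was adapted for generation.

```python
# pip install transformers torch
from transformers import pipeline

# Assumed checkpoint: the paper names "T5" without specifying a size,
# so t5-small is used here only to keep the example lightweight.
summarizer = pipeline("summarization", model="t5-small")

article = (
    "The central bank announced on Friday that it would hold interest rates "
    "steady for the third consecutive quarter, citing stable inflation and "
    "steady job growth, while warning that global trade tensions remain a risk."
)

# max_length / min_length bound the generated summary in tokens; the upper
# bound loosely matches the ~100-word target summaries described in the paper.
result = summarizer(article, max_length=100, min_length=20, do_sample=False)
print(result[0]["summary_text"])
```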
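
Model selection in the paper is driven by ROUGE scores computed over the four datasets. A minimal sketch of that evaluation step, assuming the open-source rouge_score package (the paper does not state which ROUGE implementation was used), might look like this:

```python
# pip install rouge-score
from rouge_score import rouge_scorer

def average_rouge(references, predictions):
    """Average ROUGE-1, ROUGE-2 and ROUGE-L F1 over (reference, prediction) pairs."""
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    totals = {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0}
    for ref, pred in zip(references, predictions):
        result = scorer.score(ref, pred)
        for key in totals:
            totals[key] += result[key].fmeasure
    return {key: value / len(references) for key, value in totals.items()}

# Hypothetical reference summaries and model outputs, for illustration only.
refs = ["the cabinet approved the new data protection bill on monday"]
preds = ["cabinet approves new data protection bill"]
print(average_rouge(refs, preds))
```

The same routine would be run once per model and dataset, and the averaged F1 scores compared to pick the summarizer, mirroring the comparison described in the abstract.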