An Explorative Study on Extractive Text Summarization through k-means, LSA, and TextRank

K. Ramani, K. Bhavana, A. Akshaya, K. Harshita, C. R. Thoran Kumar, Maya Srikanth
Published in: 2023 International Conference on Wireless Communications Signal Processing and Networking (WiSPNET)
Publication date: 2023-03-29
DOI: 10.1109/WiSPNET57748.2023.10134303 (https://doi.org/10.1109/WiSPNET57748.2023.10134303)

Abstract

Text summarization is one of the difficult and exciting problems in Natural Language Processing (NLP). Understanding the main point of any type of document is crucial. Applications of text summarization include media monitoring, social media, marketing, health care, literature, and books. This study applies extractive summarization in the health care domain, where the input is a patient's health history. To review a lengthy patient health history document quickly, we use three machine learning techniques, k-means, TextRank, and Latent Semantic Analysis (LSA), to identify the sections that convey important information and assemble them into a summary. The methods are evaluated using ROUGE-1, ROUGE-2, and ROUGE-N metrics to determine which one produces extracted text with the highest similarity. k-means outperformed TextRank and LSA in summarizing the documents, achieving an average of 94.52% precision, 90.98% recall, and 91.25% F1-score.
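The precision, recall, and F1 figures above are ROUGE-style overlap scores. As a minimal sketch of how such scores can be computed, the Python function below counts clipped n-gram overlap between a candidate summary and a reference summary. The function names, the whitespace tokenization, and the sample sentences are assumptions made for illustration; the abstract does not describe the authors' tooling or preprocessing.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N precision, recall, and F1 from clipped n-gram overlap.

    Tokenization is plain lowercased whitespace splitting, which is an
    assumption of this sketch and not necessarily what the paper used.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the paper's data set.
    reference = "the patient has a history of diabetes and hypertension"
    candidate = "the patient has diabetes and hypertension"
    print(rouge_n(candidate, reference, n=1))  # ROUGE-1
    print(rouge_n(candidate, reference, n=2))  # ROUGE-2
```

For the hypothetical pair above, all six candidate unigrams appear among the nine reference unigrams, so ROUGE-1 precision is 1.0 and recall is about 0.667. Setting n=2 gives the bigram variant, which also penalizes the words the extract skipped.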