Marathi Extractive Text Summarization using Latent Semantic Analysis and Fuzzy Algorithms

Virat V. Giri, Dr. M. M. Math, Dr. U. P. Kulkarni
Computational Intelligence and Machine Learning, published 2023-04-14. DOI: 10.36647/ciml/04.01.a008 (https://doi.org/10.36647/ciml/04.01.a008)

Abstract

Extractive text summarization retains only the most important sentences of a document. Both statistical and machine learning-based methods have been applied to this task. The crucial step is ranking the sentences of the document correctly by importance. The singular value decomposition (SVD) algorithm, based on latent semantic analysis, identifies the parts of a document that are semantically related. Fuzzy algorithms reason about the priority order of sentences using fuzzy logic, with graded rather than discrete values. While significant work has been done on extractive text summarization for English and several other languages, there is ample scope for improving system performance on Marathi text. In this paper, SVD and fuzzy algorithms are proposed for extractive text summarization of Marathi documents. The modeling principle, data flow, and parameters of these algorithms are adapted so that they suit the task. The characteristics of the two techniques are analyzed to compare their benefits and shortcomings. Both algorithms are evaluated on a document dataset using standard performance metrics, including ROUGE. An unbiased comparison of the two techniques is carried out to inform their applicability, especially for Marathi and, more generally, non-English text.
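
To make the SVD-based ranking idea concrete, the following is a minimal sketch of one common LSA scoring scheme. It is not necessarily the configuration used in the paper. It assumes whitespace tokenization and raw term counts (real Marathi text would need proper tokenization, stop-word removal, and stemming), and the function names `lsa_sentence_scores` and `summarize`, as well as the parameters `k` and `n`, are hypothetical names chosen for illustration.

```python
import numpy as np


def lsa_sentence_scores(sentences, k=2):
    """Score each sentence by its vector length in the top-k latent topic space."""
    # Assumption: whitespace tokenization stands in for real preprocessing.
    tokenized = [s.split() for s in sentences]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}

    # Term-by-sentence count matrix A (rows: terms, columns: sentences).
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for tok in toks:
            A[index[tok], j] += 1.0

    # A = U @ diag(s) @ Vt; column j of Vt places sentence j in topic space.
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))

    # Weight each of the top-k topics by its singular value, then take the
    # Euclidean length of every sentence vector as its importance score.
    weighted = s[:k, None] * Vt[:k, :]
    return np.sqrt((weighted ** 2).sum(axis=0))


def summarize(sentences, n=2, k=2):
    """Return the n highest-scoring sentences in their original order."""
    scores = lsa_sentence_scores(sentences, k)
    top = sorted(np.argsort(-scores)[:n])
    return [sentences[i] for i in top]
```

Calling `summarize(sentences, n=3)` on a list of sentence strings returns the three top-ranked sentences in document order. Keeping only the top `k` singular directions is the design choice that lets the ranking reflect dominant latent topics instead of raw word overlap; a fuzzy approach would instead combine graded sentence features through membership functions and rules.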