{"title":"基于余弦相似度图排序和关联的孟加拉语文本意见摘要","authors":"Shofi Ullah, Sagar Hossain, K. M. Azharul Hasan","doi":"10.1109/ICBSLP47725.2019.201494","DOIUrl":null,"url":null,"abstract":"The main idea of the automatic extractive text or opinion summarization is to find most important representative small subset of the original document without any loss of important information. There are many existing methods available for text summarization of English, Turkish, Arabic and other languages. But very few attempts has been done for Bangla language because of its having rich morphology and multifaceted structure. In this paper, we propose a joint cosine simillarity based graph ranking and Relevance based scoring and ranking approach for the summarization of bangla text. We developed a stemming algorithm based on Parts of Speech(POS) tagging consisting of around two lakhs POS tags for Bangla texts. A redundancy removal algorithm is also proposed to remove redundancy so that each sentences in the summary represents exactly the most important information in the document. The performance of the proposed approach is evaluated by measuring the recall, precision and f-score based on Rouge metric and it is also showed that proposed approach outperforms to other existing summarization methods for Bangla texts.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Opinion Summarization of Bangla Texts using Cosine Simillarity Based Graph Ranking and Relevance Based Approach\",\"authors\":\"Shofi Ullah, Sagar Hossain, K. M. Azharul Hasan\",\"doi\":\"10.1109/ICBSLP47725.2019.201494\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The main idea of the automatic extractive text or opinion summarization is to find most important representative small subset of the original document without any loss of important information. There are many existing methods available for text summarization of English, Turkish, Arabic and other languages. But very few attempts has been done for Bangla language because of its having rich morphology and multifaceted structure. In this paper, we propose a joint cosine simillarity based graph ranking and Relevance based scoring and ranking approach for the summarization of bangla text. We developed a stemming algorithm based on Parts of Speech(POS) tagging consisting of around two lakhs POS tags for Bangla texts. A redundancy removal algorithm is also proposed to remove redundancy so that each sentences in the summary represents exactly the most important information in the document. The performance of the proposed approach is evaluated by measuring the recall, precision and f-score based on Rouge metric and it is also showed that proposed approach outperforms to other existing summarization methods for Bangla texts.\",\"PeriodicalId\":413077,\"journal\":{\"name\":\"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)\",\"volume\":\"29 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICBSLP47725.2019.201494\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICBSLP47725.2019.201494","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Opinion Summarization of Bangla Texts using Cosine Simillarity Based Graph Ranking and Relevance Based Approach
The main idea of the automatic extractive text or opinion summarization is to find most important representative small subset of the original document without any loss of important information. There are many existing methods available for text summarization of English, Turkish, Arabic and other languages. But very few attempts has been done for Bangla language because of its having rich morphology and multifaceted structure. In this paper, we propose a joint cosine simillarity based graph ranking and Relevance based scoring and ranking approach for the summarization of bangla text. We developed a stemming algorithm based on Parts of Speech(POS) tagging consisting of around two lakhs POS tags for Bangla texts. A redundancy removal algorithm is also proposed to remove redundancy so that each sentences in the summary represents exactly the most important information in the document. The performance of the proposed approach is evaluated by measuring the recall, precision and f-score based on Rouge metric and it is also showed that proposed approach outperforms to other existing summarization methods for Bangla texts.