Authors: Avik Sarkar, M. Hossen
Venue: 2018 21st International Conference of Computer and Information Technology (ICCIT)
Publication date: 2018-12-01
DOI: 10.1109/ICCITECHN.2018.8631934
Automatic Bangla Text Summarization Using Term Frequency and Semantic Similarity Approach
As the volume of data in the cloud grows, finding the information one needs becomes harder, which motivates text summarization. Automatic text summarization condenses textual data into a short, concise piece of information that conveys the gist of the content. Several approaches have been introduced, but little work has been done on Bangla text summarization because of the distinct and multifaceted structure of the Bangla language. This paper illustrates the implementation of two approaches for summarizing a single Bangla document: one based on term frequency and one based on semantic sentence similarity. The text is preprocessed beforehand by tokenization, lemmatization, and removal of stopwords and noisy words. Both methods return a set of top-ranked sentences to form the summary. The rank of a sentence is determined by term frequency in the first approach and by sentence similarity in the second. The experimental results show a favorable outcome for both approaches. Further improvement of these approaches is expected to yield even better results.
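The two ranking schemes described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the scoring formulas, so the length-normalized frequency score, the bag-of-words cosine similarity (used here as a simple stand-in for the paper's semantic similarity measure), the tiny `STOPWORDS` set, and all function names are assumptions. Lemmatization is omitted entirely.

```python
import math
from collections import Counter

# Hypothetical, tiny stopword set for illustration only; a real system
# would load a full Bangla stopword lexicon.
STOPWORDS = {"এবং", "ও", "এই", "একটি", "থেকে"}
PUNCT = ",;:()\"'-"


def split_sentences(text):
    """Split on the Bangla danda and on '!' / '?'."""
    for mark in "!?":
        text = text.replace(mark, "।")
    return [s.strip() for s in text.split("।") if s.strip()]


def tokenize(sentence):
    """Whitespace tokenization with stopword removal (no lemmatization)."""
    tokens = (w.strip(PUNCT) for w in sentence.split())
    return [w for w in tokens if w and w not in STOPWORDS]


def rank_by_term_frequency(sentences):
    """Score each sentence by the mean document-level frequency of its words.

    Dividing by sentence length is an assumption made here so that long
    sentences are not favored automatically.
    """
    tokenized = [tokenize(s) for s in sentences]
    tf = Counter(w for toks in tokenized for w in toks)
    return [sum(tf[w] for w in toks) / len(toks) if toks else 0.0
            for toks in tokenized]


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_by_similarity(sentences):
    """Score each sentence by its total similarity to all other sentences."""
    vecs = [Counter(tokenize(s)) for s in sentences]
    return [sum(cosine(v, u) for j, u in enumerate(vecs) if j != i)
            for i, v in enumerate(vecs)]


def summarize(text, scorer, k=2):
    """Return the k top-ranked sentences, restored to document order."""
    sentences = split_sentences(text)
    scores = scorer(sentences)
    top = sorted(range(len(sentences)),
                 key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(document, rank_by_term_frequency)` or `summarize(document, rank_by_similarity)` selects the ranking scheme. Tokens are split on whitespace rather than with a `\w+` regular expression because Bangla vowel signs are combining marks that `\w` does not match, so a regex would break words apart.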