Kamrus Salehin, M. Alam, Md. Ashifun Nabi, Fahim Ahmed, Faisal Bin Ashraf
Title: A Comparative Study of Different Text Classification Approaches for Bangla News Classification
Published in: 2021 24th International Conference on Computer and Information Technology (ICCIT)
Publication date: 2021-12-18
DOI: 10.1109/ICCIT54785.2021.9689843 (https://doi.org/10.1109/ICCIT54785.2021.9689843)
Citations: 4
Abstract
Nearly everything is now being digitized, and technology shapes almost every part of daily life. As a result, a massive number of text documents are generated on online platforms, and news articles are no exception. People prefer online news portals because they are updated every hour. News articles fall into many categories, such as politics, sports, business, and entertainment. Recently, the number of Bangla online news portals has grown rapidly. Recommending preferred news categories to online readers would help them locate the articles they want. Manually categorizing news articles takes a great deal of time and effort, so automatic text categorization is necessary to handle the enormous amount of uncategorized data. Research on news article categorization has advanced greatly for languages such as English, Arabic, Chinese, Urdu, and Hindi, but Bangla has seen comparatively little development. Some earlier approaches categorized Bangla news articles using machine learning algorithms with minimal resources. We applied five machine learning classifiers and two neural networks to categorize Bangla news articles; of these, the LSTM neural network performed best. To compare the applied algorithms and show which performs better, we used four performance metrics.
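The abstract does not name the four metrics. As an illustration only, the sketch below assumes they are accuracy, precision, recall, and F1-score, which are commonly reported for multi-class text classification, and computes macro-averaged versions of them in plain Python. The function name `macro_metrics` and the toy category labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: these are the four metrics; the abstract does not list them.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with hypothetical news categories.
y_true = ["sports", "sports", "politics", "politics", "business", "business"]
y_pred = ["sports", "politics", "politics", "politics", "business", "sports"]
scores = macro_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Running the same function on the predictions of each classifier over a shared test set would give one row of scores per model, which is one way such a comparison could be tabulated.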