{"title":"Sentiment Analysis on Movie Review Data Using Machine Learning Approach","authors":"Atiqur Rahman, M. Hossen","doi":"10.1109/ICBSLP47725.2019.201470","DOIUrl":"https://doi.org/10.1109/ICBSLP47725.2019.201470","url":null,"abstract":"At present Sentiment analysis is the most discussed topic which is purposed to assist one to get important information from a large dataset. It centers on the investigation and comprehension of the feelings from the text patterns. It automatically characterizes the expression of feelings, e.g., negative, positive or neutral about the existence of anything. Various sources like medical, social media, newspaper, and movie review can be used in data analysis. Here, we have collected movie review data as well as used five kinds of machine learning classifiers to analyze these data. Hence, the considered classifiers are Bernoulli Naïve Bayes (BNB), Decision Tree (DE), Support Vector Machine (SVM), Maximum Entropy (ME), as well as Multinomial Naïve Bayes (MNB). Our analysis outlines that MNB achieves better accuracy, precision and F-score while SVM shows higher recall compared to others. Besides it also show that BNB Classifier achieves better accuracy than previous experiment over this classifier.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130798293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bengali Question Answering System for Factoid Questions: A statistical approach","authors":"Sourav Sarker, Syeda Tamanna Alam Monisha, Md Mahadi Hasan Nahid","doi":"10.1109/ICBSLP47725.2019.201512","DOIUrl":"https://doi.org/10.1109/ICBSLP47725.2019.201512","url":null,"abstract":"Question answering system in recent days is one of the most trending and interesting topics of research in computational linguistics. Bengali being among the most spoken languages in the world has yet faced difficulties in computational linguistics. This paper demonstrates an attempt to develop a closed domain factoid question answering system for Bengali language. Our proposed system combining multiple sources for answer extraction extracts the answer having the accuracy 66.2% and 56.8% with and without mentioning the object name respectively. The system also hits around 72% documents from which the answer can be extracted. Besides the sub-parts of our system, the question and document classifier provides 90.6% and 75.3% accuracy respectively over five coarse-grained categories.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116911369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bangla Numeral Recognition from Speech Signal Using Convolutional Neural Network","authors":"M. Shuvo, Shaikh Akib Shahriyar, M. Akhand","doi":"10.1109/ICBSLP47725.2019.201540","DOIUrl":"https://doi.org/10.1109/ICBSLP47725.2019.201540","url":null,"abstract":"Speech recognition is a process where an acoustic signal is converted to text or words or commands and recognizing the speech. In this paper, a Bangla numeral recognition system from the speech signal is developed utilizing Convolutional Neural Network (CNN). In the proposed system, a speech dataset of ten isolated Bangla digits has been developed consists of 6000 utterances (5 utterances for every 120 speakers) and a feature extraction procedure is performed to elicit significant features from the speech signals using Mel Frequency Cepstrum Coefficient (MFCC) analysis. Then, CNN is trained with the features of the speech signal as input. The efficiency of the proposed system is tested on the dataset developed for this purpose, and acquire 93.65% recognition accuracy. The proposed system is also compared with other existing methods of Bangla numeral speech recognition and outperforms most of the existing systems and proves the superiority of itself.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117222039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cricket Sentiment Analysis from Bangla Text Using Recurrent Neural Network with Long Short Term Memory Model","authors":"Md Ferdous Wahid, Md. Jahid Hasan, Md. Shahin Alom","doi":"10.1109/ICBSLP47725.2019.201500","DOIUrl":"https://doi.org/10.1109/ICBSLP47725.2019.201500","url":null,"abstract":"Nowadays, people used to express their feelings, thoughts, suggestions and opinions on different social platform and video sharing media. Many discussions are made on Twitter, Facebook and many respective forums on sports especially cricket and football. The opinion may express criticism in different manner, notation that may comprise different polarity like positive, negative or neutral and it is a challenging task even for human to understand the sentiment of each opinion as well as time consuming. This problem can be solved by analyzing sentiment in respective comments through natural language processing (NLP). Along with the success of many deep learning domains, Recurrent Neural Network (RNN) with Long-Short-Term-Memory (LSTM) is popularly used in NLP task like sentiment analysis. We have prepared a dataset about cricket comment in Bangla text of real people sentiments in three categories i.e. positive, negative and neutral and processed it by removing unnecessary words from the dataset. Then we have used word embedding method for vectorization of each word and for long term dependencies we used LSTM. The accuracy of this approach has given 95% that beyond the accuracy of previous all method.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122746729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthetic Class Specific Bangla Handwritten Character Generation Using Conditional Generative Adversarial Networks","authors":"Zinnia Khan Nishat, Md Shopon","doi":"10.1109/ICBSLP47725.2019.201475","DOIUrl":"https://doi.org/10.1109/ICBSLP47725.2019.201475","url":null,"abstract":"Bangla handwritten character recognition is known to be one of the most classical problem in the field of machine learning. In order to solve a machine learning problem one must thing is dataset. The more varied data a model sees the better it learns. Generative adversarial networks (GANs) are a group of neural networks that are used in unsupervised machine learning. It helps to resolve many difficult operations such as image generation from description, transforming low resolution image into high resolution, retrieving image contents given a small pattern etc. GAN's have many other promising applications in machine learning. There are many variations available for GAN. One of the variation of GAN is Conditional Generative Adversarial Networks(cGAN). This kind of GAN is used for generating a specific type of image. In this work we have used cGAN for generating Class Based Character Generation. This work can help researchers to generate handwritten characters to enhance the performance of deep learning models. We have trained this model to generate 50 Basic Bangla Characters, 10 Bangla Numerals and 24 Compound characters.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128206541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Annotated Bangla Sentiment Analysis Corpus","authors":"F. Rahman, H. Khan, Zakir Hossain, Mahfuza Begum, Sadia Mahanaz, Ashraful Islam, Aminul Islam","doi":"10.1109/ICBSLP47725.2019.201474","DOIUrl":"https://doi.org/10.1109/ICBSLP47725.2019.201474","url":null,"abstract":"This paper presents a Bangla corpus specifically targeted for sentiment analysis and made available to researchers under an open-source licensing scheme. We have collected and manually annotated over 10,000 sentences with sentiment polarity. We then moved to the Word domain and annotated over 15,000 words derived from these sentences with sentiment polarity. Each entry in the corpus has been cross-annotated by at least two and sometimes three annotators for ensuring quality. Also as a pre-requisite of creating a high quality sentiment analysis corpus, we had to build a secondary corpus for Bangla word stemming, which has also been cross-validated by at least two and sometimes three annotators for ensuring quality.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134576753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Does Word2Vec encode human perception of similarity? A study in Bangla","authors":"Manjira Sinha, Rakesh Dutta, Tirthankar Dasgupta","doi":"10.1109/ICBSLP47725.2019.201567","DOIUrl":"https://doi.org/10.1109/ICBSLP47725.2019.201567","url":null,"abstract":"The quest to understand how language and concepts are organized in human mind is a neverending pursuit undertaken by researchers in computational psycholinguistics; simultaneously, on the other hand, researchers have tried to quantitatively model the semantic space from written corpora and discourses through different computational approaches - while both of these interacts with each other in-terms of understanding human processing through computational linguistics and enhancing NLP methods from the insights, it has seldom been systematically studied if the two corroborates each other. In this paper, we have explored how and if the standard word embedding based semantic representation models represent the human mental lexicon. Towards that, We have conducted a semantic priming experiment to capture the psycholinguistics aspects and compared the results with a distributional word-embedding model: Bangla word2Vec. Analysis of reaction time indicates that corpus-based semantic similarity measures do not reflect the true nature of mental representation and processing of words. To the best of our knowledge this is first of a kind study in any language especially Bangla.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130379494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Basic to Compound: A Novel Transfer Learning Approach for Bengali Handwritten Character Recognition","authors":"Sakib Reza, Ohida Binte Amin, M. Hashem","doi":"10.1109/ICBSLP47725.2019.201522","DOIUrl":"https://doi.org/10.1109/ICBSLP47725.2019.201522","url":null,"abstract":"Transfer learning is widely used in various character recognition tasks. In this paper, we propose a transfer learning approach with convolutional neural network (CNN) for Bengali handwritten character recognition. When children learn the Bengali scripts, they first learn basic characters (vowels and consonants) and then go for compound characters (consonant conjuncts). Without prior knowledge of basic characters, it would be quite difficult for them to learn compound characters. In our approach, the machine mimics this human child learning process. Our study shows that CNN trained on basic characters is well capable of recognizing compound characters with minimal retraining. It performs better and also trains much faster than CNN fully trained on compound characters. Similarly, CNN trained on digits easily recognizes basic characters with a short period of training. Furthermore, pretrained CNN consistently outperforms the randomly initialized CNN while training only last few layers.","PeriodicalId":413077,"journal":{"name":"2019 International Conference on Bangla Speech and Language Processing (ICBSLP)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128642994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}