Comparison of Supervised Machine Learning Models for Categorizing E-Commerce Product Titles in Myanmar Text
Authors: Khin Yee Mon Thant, K. Nwet
Published in: 2020 International Conference on Advanced Information Technologies (ICAIT), 2020-11-04
DOI: 10.1109/ICAIT51105.2020.9261779 (https://doi.org/10.1109/ICAIT51105.2020.9261779)
Although the number of online businesses in Myanmar has increased in recent years, the number of e-commerce sites in the Myanmar language remains very low. One reason is that rapidly growing e-commerce sites need methods such as automatic product categorization to provide a better user experience, yet no off-the-shelf system exists for Myanmar text. Research in this area is also limited by the lack of real-world datasets. In this paper, an e-commerce product corpus was developed, and four supervised machine learning algorithms, Support Vector Machine (SVM), Random Forest (RF), Naive Bayes (NB), and Logistic Regression (LR), were evaluated for classifying product titles in the Myanmar language. Over 300,000 product titles were scraped from a Myanmar e-commerce site and classified into 15 predefined categories. The experimental results show that SVM achieves the highest accuracy among the supervised machine learning algorithms evaluated in this paper.
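As a rough illustration of the task described above, the following minimal sketch trains one of the four named algorithms, Naive Bayes, on product titles and measures accuracy on held-out titles. It is not the authors' implementation: the abstract does not state the features or tokenization used, so the character-bigram features, the add-one smoothing, the class name `NaiveBayesTitleClassifier`, and the English toy titles are all assumptions chosen for illustration. Character n-grams are used here only because they avoid the need for a Myanmar word segmenter.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(title, n=2):
    """Split a title into overlapping character n-grams (an assumed feature choice)."""
    text = title.lower()
    return [text[i:i + n] for i in range(max(len(text) - n + 1, 1))]


class NaiveBayesTitleClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing; hypothetical helper."""

    def fit(self, titles, labels):
        self.class_counts = Counter(labels)          # number of titles per category
        self.feature_counts = defaultdict(Counter)   # n-gram counts per category
        self.vocab = set()
        for title, label in zip(titles, labels):
            grams = char_ngrams(title)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        self.total = len(labels)
        return self

    def predict(self, title):
        grams = char_ngrams(title)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods of each n-gram.
            score = math.log(count / self.total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for gram in grams:
                score += math.log((self.feature_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, titles, labels):
    """Fraction of titles whose predicted category matches the true category."""
    correct = sum(model.predict(t) == y for t, y in zip(titles, labels))
    return correct / len(labels)


# Hypothetical toy data; the paper's corpus has over 300,000 Myanmar titles
# in 15 categories, which are not reproduced here.
train_titles = ["wireless bluetooth headphones", "usb phone charger cable",
                "cotton summer dress", "mens denim jacket"]
train_labels = ["electronics", "electronics", "fashion", "fashion"]
test_titles = ["bluetooth speaker", "denim dress"]
test_labels = ["electronics", "fashion"]

model = NaiveBayesTitleClassifier().fit(train_titles, train_labels)
test_accuracy = accuracy(model, test_titles, test_labels)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

A comparison like the one in the paper would repeat the same fit-and-score step for each algorithm on an identical train/test split, so that the accuracy figures are directly comparable.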