Comparison of Supervised Machine Learning Models for Categorizing E-Commerce Product Titles in Myanmar Text

Khin Yee Mon Thant, K. Nwet
Published in: 2020 International Conference on Advanced Information Technologies (ICAIT)
Publication date: 2020-11-04
DOI: 10.1109/ICAIT51105.2020.9261779 (https://doi.org/10.1109/ICAIT51105.2020.9261779)
Citations: 1

Abstract

Although the number of online businesses in Myanmar has increased in recent years, the number of e-commerce sites in the Myanmar language remains very low. One reason is that, as e-commerce sites grow rapidly, they need methods such as automatic product categorization to provide a good user experience, yet no off-the-shelf system for Myanmar text exists so far. Research in this area is also limited by the lack of real-world datasets. In this paper, an e-commerce product corpus was developed, and four supervised machine learning algorithms, Support Vector Machine (SVM), Random Forest (RF), Naive Bayes (NB) and Logistic Regression (LR), were evaluated for classifying product titles in the Myanmar language. Over 300,000 product titles were scraped from a Myanmar e-commerce site and classified into 15 predefined categories. The experimental results show that SVM achieves the highest accuracy among the supervised machine learning algorithms evaluated in this paper.
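To make the classification task concrete, the following is a minimal sketch of one of the four algorithm families named in the abstract, Naive Bayes, applied to product titles. The abstract does not describe the features or implementation used, so everything here is an assumption for illustration: the character-bigram features (chosen because they avoid the need for Myanmar word segmentation), the add-one smoothing, the hypothetical names `char_ngrams`, `NaiveBayesTitleClassifier` and `accuracy`, and the toy English titles, which are placeholders rather than samples from the paper's corpus.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(title, n=2):
    """Split a title into overlapping character n-grams (assumed feature choice)."""
    text = title.strip()
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NaiveBayesTitleClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, titles, labels):
        self.class_counts = Counter(labels)          # titles per category
        self.feature_counts = defaultdict(Counter)   # n-gram counts per category
        self.vocab = set()
        for title, label in zip(titles, labels):
            grams = char_ngrams(title)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        self.total = sum(self.class_counts.values())
        return self

    def predict(self, title):
        grams = char_ngrams(title)
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + vocab_size
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(count / self.total)
            for gram in grams:
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, titles, labels):
    """Fraction of titles whose predicted category matches the label."""
    correct = sum(model.predict(t) == y for t, y in zip(titles, labels))
    return correct / len(labels)


# Toy placeholder data (not from the paper's corpus).
train_titles = ["red cotton shirt", "blue denim shirt",
                "wireless phone charger", "smart phone case"]
train_labels = ["fashion", "fashion", "electronics", "electronics"]
model = NaiveBayesTitleClassifier().fit(train_titles, train_labels)
```

A comparison like the one the abstract reports would train each candidate model on the same training split and compute `accuracy` on a held-out split; the sketch covers only the Naive Bayes case and makes no claim about the paper's actual setup or results.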