Proceedings of The 4th Workshop on e-Commerce and NLP: Latest Publications

Product Review Translation: Parallel Corpus Creation and Robustness towards User-generated Noisy Text
Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 2021-08-01 DOI: 10.18653/v1/2021.ecnlp-1.21
Kamal Kumar Gupta, Soumya Chennabasavaraj, Nikesh Garera, Asif Ekbal
Abstract: Reviews written by users for a particular product or service play an influential role in helping customers make informed decisions. Although online e-commerce portals have immensely impacted our lives, the available content is predominantly in English, which often limits its widespread usage. There is an exponential growth in the number of e-commerce users who are not proficient in English. Hence, there is a necessity to make these services available in non-English languages, especially in a multilingual country like India. This can be achieved by an in-domain robust machine translation (MT) system. However, the reviews written by users pose unique challenges to MT, such as misspelled words, ungrammatical constructions, the presence of colloquial terms, and a lack of resources such as an in-domain parallel corpus. We address the above challenges by presenting an English–Hindi review domain parallel corpus. We train an English-to-Hindi neural machine translation (NMT) system to translate the product reviews available on e-commerce websites. By training the Transformer-based NMT model over the generated data, we achieve a score of 33.26 BLEU points for English-to-Hindi translation. In order to make our NMT model robust enough to handle the noisy tokens in the reviews, we integrate a character-based language model to generate word vectors and map the noisy tokens to their correct forms. Experiments on four language pairs, viz. English-Hindi, English-German, English-French, and English-Czech, show BLEU scores of 35.09, 28.91, 34.68 and 14.52, which are improvements of 1.61, 1.05, 1.63 and 1.94, respectively, over the baseline.
Citations: 2
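The Product Review Translation abstract above describes mapping noisy review tokens to their correct forms using word vectors built from characters. The sketch below only illustrates that general idea and is not the authors' method: it swaps the character-based language model for plain character trigram counts with cosine similarity. The function names, the example vocabulary, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(word, n=3):
    """Count character n-grams of a word, padded with boundary markers."""
    padded = f"<{word.lower()}>"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize_token(token, vocabulary, threshold=0.5):
    """Map a possibly misspelled token to its closest vocabulary word.

    The token is returned unchanged when it is already in the vocabulary
    or when no vocabulary word is similar enough (hypothetical threshold).
    """
    if token in vocabulary:
        return token
    token_vec = char_ngrams(token)
    best = max(vocabulary, key=lambda w: cosine(token_vec, char_ngrams(w)))
    if cosine(token_vec, char_ngrams(best)) >= threshold:
        return best
    return token


# Hypothetical clean vocabulary taken from in-domain training data.
vocab = ["excellent", "product", "quality", "delivery"]
cleaned = [normalize_token(t, vocab) for t in "excelent prodct zzz".split()]
```

In a translation pipeline, a step like this would run on the source sentence before it reaches the NMT model, so that tokens such as "prodct" are replaced by forms the model has seen during training.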
Combining semantic search and twin product classification for recognition of purchasable items in voice shopping
Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.18
Dieu-Thu Le, Verena Weber, Melanie Bradford
Abstract: The accuracy of an online shopping system via voice commands is particularly important and may have a great impact on customer trust. This paper focuses on the problem of detecting if an utterance contains actual and purchasable products, thus referring to a shopping-related intent in a typical Spoken Language Understanding architecture consisting of an intent classifier and a slot detector. Searching through billions of products to check if a detected slot is a purchasable item is prohibitively expensive. To overcome this problem, we present a framework that (1) uses a retrieval module that returns the most relevant products with respect to the detected slot, and (2) combines it with a twin network that decides if the detected slot is indeed a purchasable item or not. Through various experiments, we show that this architecture outperforms a typical slot detector approach, with a gain of +81% in accuracy and +41% in F1 score.
Citations: 0
Attribute Value Generation from Product Title using Language Models
Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.2
Kalyani Roy, Pawan Goyal, Manish Pandey
Abstract: Identifying the values of product attributes is essential for many e-commerce functions such as product search and product recommendations. Therefore, identifying attribute values from unstructured product descriptions is a critical undertaking for any e-commerce retailer. What makes this problem challenging is the diversity of product types and their attributes and values. Existing methods have typically employed multiple types of machine learning models, each of which handles specific product types or attribute classes. This has limited their scalability and generalization for large-scale real-world e-commerce applications. Previous approaches for this task have formulated attribute value extraction as a Named Entity Recognition (NER) task or a Question Answering (QA) task. In this paper, we present a generative approach to the attribute value extraction problem using language models. We leverage the large-scale pretraining of GPT-2 and the T5 text-to-text transformer to create fine-tuned models that can effectively perform this task. We show that a single general model is very effective for this task over a broad set of product attribute values with the open world assumption. Our approach achieves state-of-the-art performance for different attribute classes, which has previously required a diverse set of models.
Citations: 11
Unsupervised Class-Specific Abstractive Summarization of Customer Reviews
Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.11
Thi Thuy Anh Nguyen, Mingwei Shen, K. Hovsepian
Abstract: Large-scale unsupervised abstractive summarization is sorely needed to automatically scan millions of customer reviews in today’s fast-paced e-commerce landscape. We address a key challenge in unsupervised abstractive summarization – reducing generic and uninformative content and producing useful information that relates to specific product aspects. To do so, we propose to model reviews in the context of some topical classes of interest. In particular, for any arbitrary set of topical classes of interest, the proposed model can learn to generate a set of class-specific summaries from multiple reviews of each product without ground-truth summaries, and the only required signal is class probabilities or class label for each review. The model combines a generative variational autoencoder, with an integrated class-correlation gating mechanism and a hierarchical structure capturing dependence among products, reviews and classes. Human evaluation shows that generated summaries are highly relevant, fluent, and representative. Evaluation using a reference dataset shows that our model outperforms state-of-the-art abstractive and extractive baselines.
Citations: 4
SupportNet: Neural Networks for Summary Generation and Key Segment Extraction from Technical Support Tickets
Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.20
Vinayshekhar Bannihatti Kumar, Mohan Yarramsetty, Sharon Sun, Anukul Goel
Abstract: We improve customer experience and gain their trust when their issues are resolved rapidly with less friction. Existing work has focused on reducing the overall case resolution time by binning a case into predefined categories and routing it to the desired support engineer. However, the actions taken by the engineer during case analysis and resolution are altogether ignored, even though they form the bulk of the case resolution time. In this work, we propose two systems that enable support engineers to resolve cases faster. The first, a guidance extraction model, mines historical cases and provides technical guidance phrases to the support engineers. The phrases can then be used to educate the customer or to obtain critical information needed to resolve the case and thus minimize the number of correspondences between the engineer and customer. The second, a summarization model, creates an abstractive summary of the case to provide better context to the support engineer. Through quantitative evaluation we obtain an F1 score of 0.64 on the guidance extraction model and a BERTScore (F1) of 0.55 on the summarization model.
Citations: 1
Detect Profane Language in Streaming Services to Protect Young Audiences
Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.15
Jingxiang Chen, Kaimin Wei, Xiang Hao
Abstract: With the rapid growth of online video streaming, recent years have seen increasing concerns about profane language in streamed content. Detecting profane language in streaming services is challenging due to the long sentences that appear in a video. While recent research on handling long sentences has focused on developing deep learning modeling techniques, little work has focused on techniques for improving data pipelines. In this work, we develop a data collection pipeline to address long sequences of text and integrate this pipeline with a multi-head self-attention model. With this pipeline, our experiments show the self-attention model offers a 12.5% relative accuracy improvement over a state-of-the-art DistilBERT model on profane language detection while requiring only 3% of the parameters. This research designs a better system for informing users of profane language in video streaming services.
Citations: 1
Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation
Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.16
Justin Winkler, Simon Brugman, Bas van Berkel, M. Larson
Abstract: We carry out a case study on the use of data programming to create data to train classifiers used for product moderation on a large e-commerce platform. Data programming is a recently introduced technique that uses human-defined rules to generate training data sets without tedious item-by-item hand labeling. Our study investigates methods for allowing product moderators to quickly modify the rules given their knowledge of the domain and, especially, of textual item descriptions. Our results show promise that moderators can use this approach to steer the training data, making possible fast and close control of classifiers that detect policy violations.
Citations: 0
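The data programming idea in the abstract above, where human-defined rules generate training labels instead of item-by-item annotation, can be sketched roughly as follows. This is an illustrative simplification and not the pipeline from the paper: the three rules, the label names, and the example descriptions are hypothetical, and votes are combined by a simple majority, whereas data programming systems typically fit a generative label model over the rule outputs.

```python
from collections import Counter

# Hypothetical label values; ABSTAIN means a rule has no opinion.
ABSTAIN, OK, VIOLATION = -1, 0, 1


def lf_banned_term(description):
    """Flag descriptions containing a banned word (hypothetical list)."""
    banned = {"replica", "counterfeit"}
    words = set(description.lower().split())
    return VIOLATION if banned & words else ABSTAIN


def lf_official_seller(description):
    """Treat listings that mention an official store as acceptable."""
    return OK if "official" in description.lower() else ABSTAIN


def lf_off_platform_contact(description):
    """Flag attempts to move the sale off the platform."""
    return VIOLATION if "whatsapp" in description.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_banned_term, lf_official_seller, lf_off_platform_contact]


def weak_label(description, labeling_functions=LABELING_FUNCTIONS):
    """Combine rule votes by majority; abstain on no votes or on a tie."""
    votes = [lf(description) for lf in labeling_functions]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes)
    top_label, top_count = counts.most_common(1)[0]
    if list(counts.values()).count(top_count) > 1:
        return ABSTAIN  # conflicting rules with equal support
    return top_label


descriptions = [
    "Replica watch, contact via WhatsApp",
    "Official store phone case",
    "Blue cotton shirt",
]
training_labels = [weak_label(d) for d in descriptions]
```

Under this setup, a moderator who spots a new kind of violation edits or adds one rule and regenerates the labels, which is the kind of fast control over the training data that the abstract refers to.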