Proceedings of The 4th Workshop on e-Commerce and NLP: Latest Publications

Product Review Translation: Parallel Corpus Creation and Robustness towards User-generated Noisy Text

Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 2021-08-01 DOI: 10.18653/v1/2021.ecnlp-1.21

Kamal Kumar Gupta, Soumya Chennabasavaraj, Nikesh Garera, Asif Ekbal

Abstract: Reviews written by users for a particular product or service play an influential role in helping customers make informed decisions. Although online e-commerce portals have immensely impacted our lives, the available content is predominantly in English, which often limits its widespread use. There is exponential growth in the number of e-commerce users who are not proficient in English. Hence, there is a need to make these services available in non-English languages, especially in a multilingual country like India. This can be achieved with a robust in-domain machine translation (MT) system. However, user-written reviews pose unique challenges to MT, such as misspelled words, ungrammatical constructions, colloquial terms, and a lack of resources such as in-domain parallel corpora. We address these challenges by presenting an English–Hindi review-domain parallel corpus. We train an English-to-Hindi neural machine translation (NMT) system to translate the product reviews available on e-commerce websites. By training a Transformer-based NMT model on the generated data, we achieve a score of 33.26 BLEU points for English-to-Hindi translation. To make our NMT model robust enough to handle the noisy tokens in the reviews, we integrate a character-based language model to generate word vectors and map the noisy tokens to their correct forms. Experiments on four language pairs, viz. English-Hindi, English-German, English-French, and English-Czech, show BLEU scores of 35.09, 28.91, 34.68 and 14.52, which are improvements of 1.61, 1.05, 1.63 and 1.94, respectively, over the baseline.

Citations: 2
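The abstract of the entry above describes mapping noisy review tokens to their correct forms through character-level word vectors. The sketch below is only a rough illustration of that idea and is not the authors' method: it swaps in plain character trigram count vectors and cosine similarity in place of a character-based language model. The names `char_ngrams`, `normalize_token`, and `VOCAB`, the threshold of 0.4, and the sample review are all hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngrams(word, n=3):
    """Return a bag of character n-grams for a word padded with boundary markers."""
    padded = f"<{word.lower()}>"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize_token(token, vocabulary, threshold=0.4):
    """Map a token to its most similar vocabulary word, or keep it unchanged."""
    if token.lower() in vocabulary:
        return token  # already a known word
    vec = char_ngrams(token)
    best = max(vocabulary, key=lambda w: cosine(vec, char_ngrams(w)))
    # Only replace when the closest word is similar enough (assumed threshold).
    return best if cosine(vec, char_ngrams(best)) >= threshold else token


# Hypothetical clean vocabulary and noisy review, for illustration only.
VOCAB = {"the", "product", "quality", "is", "excellent", "delivery", "was", "fast"}
review = "the prodct qualty is excelent"
cleaned = " ".join(normalize_token(t, VOCAB) for t in review.split())
print(cleaned)  # the product quality is excellent
```

In a translation pipeline, a step like this would presumably run on the source sentence before it reaches the NMT model; a learned character-level model would replace the trigram counts used here.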

Combining semantic search and twin product classification for recognition of purchasable items in voice shopping

Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.18

Dieu-Thu Le, Verena Weber, Melanie Bradford

Citations: 0

Attribute Value Generation from Product Title using Language Models

Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.2

Kalyani Roy, Pawan Goyal, Manish Pandey

Abstract: Identifying the values of product attributes is essential for many e-commerce functions such as product search and product recommendations. Therefore, identifying attribute values from unstructured product descriptions is a critical undertaking for any e-commerce retailer. What makes this problem challenging is the diversity of product types and their attributes and values. Existing methods have typically employed multiple types of machine learning models, each of which handles specific product types or attribute classes. This has limited their scalability and generalization for large-scale real-world e-commerce applications. Previous approaches have formulated attribute value extraction as a Named Entity Recognition (NER) task or a Question Answering (QA) task. In this paper, we present a generative approach to the attribute value extraction problem using language models. We leverage the large-scale pretraining of GPT-2 and the T5 text-to-text transformer to create fine-tuned models that can effectively perform this task. We show that a single general model is very effective for this task over a broad set of product attribute values under the open-world assumption. Our approach achieves state-of-the-art performance for different attribute classes, which previously required a diverse set of models.

Citations: 11

Unsupervised Class-Specific Abstractive Summarization of Customer Reviews

Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.11

Thi Thuy Anh Nguyen, Mingwei Shen, K. Hovsepian

Abstract: Large-scale unsupervised abstractive summarization is sorely needed to automatically scan millions of customer reviews in today's fast-paced e-commerce landscape. We address a key challenge in unsupervised abstractive summarization: reducing generic and uninformative content and producing useful information that relates to specific product aspects. To do so, we propose to model reviews in the context of some topical classes of interest. In particular, for any arbitrary set of topical classes of interest, the proposed model can learn to generate a set of class-specific summaries from multiple reviews of each product without ground-truth summaries; the only required signal is class probabilities or a class label for each review. The model combines a generative variational autoencoder with an integrated class-correlation gating mechanism and a hierarchical structure capturing dependence among products, reviews and classes. Human evaluation shows that the generated summaries are highly relevant, fluent, and representative. Evaluation using a reference dataset shows that our model outperforms state-of-the-art abstractive and extractive baselines.

Citations: 4

SupportNet: Neural Networks for Summary Generation and Key Segment Extraction from Technical Support Tickets

Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.20

Vinayshekhar Bannihatti Kumar, Mohan Yarramsetty, Sharon Sun, Anukul Goel

Abstract: We improve customers' experience and gain their trust when their issues are resolved rapidly and with less friction. Existing work has focused on reducing the overall case resolution time by binning a case into predefined categories and routing it to the desired support engineer. However, the actions taken by the engineer during case analysis and resolution are altogether ignored, even though they form the bulk of the case resolution time. In this work, we propose two systems that enable support engineers to resolve cases faster. The first, a guidance extraction model, mines historical cases and provides technical guidance phrases to the support engineers. The phrases can then be used to educate the customer or to obtain critical information needed to resolve the case, and thus minimize the number of exchanges between the engineer and the customer. The second, a summarization model, creates an abstractive summary of the case to provide better context to the support engineer. Through quantitative evaluation, we obtain an F1 score of 0.64 on the guidance extraction model and a BERTScore (F1) of 0.55 on the summarization model.

Citations: 1

Detect Profane Language in Streaming Services to Protect Young Audiences

Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.15

Jingxiang Chen, Kaimin Wei, Xiang Hao

Citations: 1

Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation

Proceedings of The 4th Workshop on e-Commerce and NLP Pub Date : 1900-01-01 DOI: 10.18653/v1/2021.ecnlp-1.16

Justin Winkler, Simon Brugman, Bas van Berkel, M. Larson

Citations: 0