Improving the Efficiency and Effectiveness of Multilingual Classification Methods for Sentiment Analysis

2021 Systems and Information Engineering Design Symposium (SIEDS) | Pub Date: 2021-04-30 | DOI: 10.1109/SIEDS52267.2021.9483767

Pantea Ferdosian, Sean Grace, Vasudha Manikandan, Lucas Moles, Debajyoti Datta, Donald E. Brown


Citations: 0

Abstract



The growing field of customer experience management relies heavily on natural language processing (NLP). An important current use of NLP in this industry is to efficiently build sentiment models in new languages. These new language models will allow access to a greater range of clients. In this work, we examine the practical effectiveness and training data requirements of transfer learning methods, specifically mBERT and XLM-RoBERTa, for developing sentiment analysis models in German. To provide a meaningful comparison that excludes transfer learning, we also utilize and train an LSTM classification model. The models are tested by studying the performance gains for different amounts of target language training data. The results enable efficient building of NLP models by allowing prediction of the data requirements for a desired accuracy.
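The abstract does not state how data requirements are predicted from the measured performance gains. One common approach, shown here only as a minimal sketch and not as the authors' method, is to fit a power-law learning curve to (training set size, error) pairs and invert it for a target accuracy. The function names `fit_power_law` and `examples_needed` are hypothetical, and the data points are synthetic values generated from a known curve, not results from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = a * n**(-b) by least squares in log-log space.

    Assumption: classification error decays roughly as a power law
    in the number of target-language training examples.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope  # (a, b)


def examples_needed(a, b, target_accuracy):
    """Invert the fitted curve: smallest n with a * n**(-b) <= 1 - accuracy."""
    target_error = 1.0 - target_accuracy
    return math.ceil((a / target_error) ** (1.0 / b))


# Synthetic learning curve (NOT data from the paper): error = 2 * n**-0.5.
sizes = [100, 200, 400, 800, 1600]
errors = [2.0 * n ** -0.5 for n in sizes]

a, b = fit_power_law(sizes, errors)
needed = examples_needed(a, b, target_accuracy=0.96)
print(f"a={a:.3f}, b={b:.3f}, estimated examples needed: {needed}")
```

In practice one such curve would be fitted per model (for example mBERT, XLM-RoBERTa, and the LSTM baseline) using accuracies measured on a held-out German test set, and the extrapolation is only as trustworthy as the power-law assumption over the range of interest.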
