Improving the Efficiency and Effectiveness of Multilingual Classification Methods for Sentiment Analysis

Pantea Ferdosian, Sean Grace, Vasudha Manikandan, Lucas Moles, Debajyoti Datta, Donald E. Brown
Published in: 2021 Systems and Information Engineering Design Symposium (SIEDS), 2021-04-30
DOI: 10.1109/SIEDS52267.2021.9483767 (https://doi.org/10.1109/SIEDS52267.2021.9483767)
Citations: 0

Abstract

The growing field of customer experience management relies heavily on natural language processing (NLP). An important current use of NLP in this industry is to efficiently build sentiment models in new languages, which gives access to a greater range of clients. In this work, we examine the practical effectiveness and training data requirements of transfer learning methods, specifically mBERT and XLM-RoBERTa, for developing sentiment analysis models in German. To provide a meaningful comparison that excludes transfer learning, we also train an LSTM classification model. The models are tested by studying the performance gains for different amounts of target-language training data. The results enable efficient building of NLP models by allowing prediction of the data requirements for a desired accuracy.
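The abstract does not say how data requirements are predicted from the measured performance gains. One common way to do this, shown here only as a minimal sketch, is to assume that classification error follows a power law in the number of training examples, fit that curve to a few measured points, and invert it for a target accuracy. The power-law form, the function names `fit_power_law` and `required_examples`, and the numbers below are all assumptions made for illustration; the data points are synthetic and are not results from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Least-squares fit of log(error) = log(a) - b * log(n); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def required_examples(a, b, target_accuracy):
    """Invert error = a * n**(-b) for the smallest n that reaches the target."""
    target_error = 1.0 - target_accuracy
    return math.ceil((a / target_error) ** (1.0 / b))


# Synthetic, noise-free points generated from error = 2.0 * n**(-0.4).
# These are hypothetical values for illustration, not measurements.
sizes = [100, 200, 400, 800, 1600]
errors = [2.0 * n ** -0.4 for n in sizes]

a, b = fit_power_law(sizes, errors)
n_needed = required_examples(a, b, target_accuracy=0.90)
print(f"fitted a={a:.3f}, b={b:.3f}, examples needed for 90% accuracy: {n_needed}")
```

In practice each model (for example mBERT, XLM-RoBERTa, or the LSTM baseline) would get its own fitted curve from real measurements, and the extrapolation is only as reliable as the power-law assumption over the range being predicted.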