Extract Aspect-based Financial Opinion Using Natural Language Inference

Raymond So, Chun Fai Carlin Chu, Cheuk Wing Jessie Lee
{"title":"使用自然语言推理提取基于方面的财务意见","authors":"Raymond So, Chun Fai Carlin Chu, Cheuk Wing Jessie Lee","doi":"10.1145/3543106.3543120","DOIUrl":null,"url":null,"abstract":"The emergence of transformer-based pre-trained language models (PTLMs) has bought new and improved techniques to natural language processing (NLP). Traditional rule-based NLP, for instance, is known for its deficiency of creating context-aware representations of words and sentences. Natural language inference (NLI) addresses this deficiency by using PTLMs to create context-sensitive embedding for contextual reasoning. This paper outlines a system design that uses traditional rule-based NLP and deep learning to extract aspect-based financial opinion from financial commentaries written using colloquial Cantonese, a dialect of the Chinese language used in Hong Kong. We need to confront the issue that existing off-the-shelf PTLMs, such as BERT and Roberta, are not pre-trained to understand the language semantics of colloquial Cantonese, let alone the slang, jargon, and codeword that people in Hong Kong use to articulate opinions. As a result, we approached the opinion extraction problem differently from the mainstream approaches, which use model-based named entity recognition (NER) to detect and extract opinion aspects as named entities and named entity relations. Because there is no PTLM for our specific language and problem domain, we solve the opinion extraction problem using rule-based NLP and deep learning techniques. We report our experience of creating a lexicon and identifying candidate opinion aspects in the input text using rule-based NLP. We discuss how to improve BERT’s linguistic knowledge of colloquial Cantonese through a fine-tuning procedure. We illustrate how to prepare the input text for contextual reasoning and demonstrate how to use NLI to confirm candidate opinion aspects as extractable.","PeriodicalId":150494,"journal":{"name":"Proceedings of the 2022 International Conference on E-business and Mobile Commerce","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Extract Aspect-based Financial Opinion Using Natural Language Inference\",\"authors\":\"Raymond So, Chun Fai Carlin Chu, Cheuk Wing Jessie Lee\",\"doi\":\"10.1145/3543106.3543120\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The emergence of transformer-based pre-trained language models (PTLMs) has bought new and improved techniques to natural language processing (NLP). Traditional rule-based NLP, for instance, is known for its deficiency of creating context-aware representations of words and sentences. Natural language inference (NLI) addresses this deficiency by using PTLMs to create context-sensitive embedding for contextual reasoning. This paper outlines a system design that uses traditional rule-based NLP and deep learning to extract aspect-based financial opinion from financial commentaries written using colloquial Cantonese, a dialect of the Chinese language used in Hong Kong. We need to confront the issue that existing off-the-shelf PTLMs, such as BERT and Roberta, are not pre-trained to understand the language semantics of colloquial Cantonese, let alone the slang, jargon, and codeword that people in Hong Kong use to articulate opinions. 
As a result, we approached the opinion extraction problem differently from the mainstream approaches, which use model-based named entity recognition (NER) to detect and extract opinion aspects as named entities and named entity relations. Because there is no PTLM for our specific language and problem domain, we solve the opinion extraction problem using rule-based NLP and deep learning techniques. We report our experience of creating a lexicon and identifying candidate opinion aspects in the input text using rule-based NLP. We discuss how to improve BERT’s linguistic knowledge of colloquial Cantonese through a fine-tuning procedure. We illustrate how to prepare the input text for contextual reasoning and demonstrate how to use NLI to confirm candidate opinion aspects as extractable.\",\"PeriodicalId\":150494,\"journal\":{\"name\":\"Proceedings of the 2022 International Conference on E-business and Mobile Commerce\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2022 International Conference on E-business and Mobile Commerce\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3543106.3543120\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2022 International Conference on E-business and Mobile Commerce","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3543106.3543120","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

The emergence of transformer-based pre-trained language models (PTLMs) has brought new and improved techniques to natural language processing (NLP). Traditional rule-based NLP, for instance, is known for its deficiency in creating context-aware representations of words and sentences. Natural language inference (NLI) addresses this deficiency by using PTLMs to create context-sensitive embeddings for contextual reasoning. This paper outlines a system design that uses traditional rule-based NLP and deep learning to extract aspect-based financial opinions from financial commentaries written in colloquial Cantonese, a dialect of the Chinese language used in Hong Kong. We need to confront the issue that existing off-the-shelf PTLMs, such as BERT and RoBERTa, are not pre-trained to understand the language semantics of colloquial Cantonese, let alone the slang, jargon, and codewords that people in Hong Kong use to articulate opinions. As a result, we approached the opinion extraction problem differently from the mainstream approaches, which use model-based named entity recognition (NER) to detect and extract opinion aspects as named entities and named entity relations. Because there is no PTLM for our specific language and problem domain, we solve the opinion extraction problem using rule-based NLP and deep learning techniques. We report our experience of creating a lexicon and identifying candidate opinion aspects in the input text using rule-based NLP. We discuss how to improve BERT's linguistic knowledge of colloquial Cantonese through a fine-tuning procedure. We illustrate how to prepare the input text for contextual reasoning and demonstrate how to use NLI to confirm candidate opinion aspects as extractable.
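
The confirmation step described in the abstract can be pictured as an entailment check: the commentary sentence serves as the NLI premise, and a hypothesis built from a candidate aspect is tested against it. The sketch below is illustrative only and does not reproduce the authors' system: the public joeddav/xlm-roberta-large-xnli checkpoint, the hypothesis template, and the 0.7 acceptance threshold are assumptions standing in for the paper's fine-tuned, Cantonese-aware model.

```python
# Minimal sketch of aspect confirmation via NLI entailment.
# Assumptions (not from the paper): the Hugging Face checkpoint, the
# hypothesis template, and the threshold below are illustrative stand-ins
# for the authors' fine-tuned Cantonese model and decision rule.
from transformers import pipeline

nli = pipeline("zero-shot-classification",
               model="joeddav/xlm-roberta-large-xnli")

def confirm_aspect(commentary: str, candidate_aspect: str,
                   threshold: float = 0.7) -> bool:
    """Treat the commentary as the premise and test whether it entails an
    opinion about the candidate aspect; accept if the score clears the threshold."""
    result = nli(
        commentary,
        candidate_labels=[candidate_aspect],
        hypothesis_template="這段評論表達了對{}的意見。",  # "This commentary expresses an opinion about {}."
    )
    return result["scores"][0] >= threshold

# Example: a colloquial commentary and a candidate aspect produced by
# rule-based lexicon matching (both hypothetical).
print(confirm_aspect("隻股仲有得升，坐穩啲。", "股價"))
```

In the paper's pipeline, the candidate aspects would come from the rule-based lexicon-matching stage, and the NLI model would be the one fine-tuned on colloquial Cantonese rather than an off-the-shelf multilingual checkpoint.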