FinBERT—A Deep Learning Approach to Extracting Textual Information

Allen Huang, Hui Wang, Yi Yang
{"title":"FinBERT—A Deep Learning Approach to Extracting Textual Information","authors":"Allen Huang, Hui Wang, Yi Yang","doi":"10.2139/ssrn.3910214","DOIUrl":null,"url":null,"abstract":"In this paper, we develop FinBERT, a state-of-the-art deep learning algorithm that incorporates the contextual relations between words in the finance domain. First, using a researcher-labeled analyst report sample, we document that FinBERT significantly outperforms the Loughran and McDonald (LM) dictionary, the naïve Bayes, and Word2Vec in sentiment classification, primarily because of its ability to uncover sentiment in sentences that other algorithms mislabel as neutral. Next, we show that other approaches underestimate the textual informativeness of earnings conference calls by at least 32% compared with FinBERT. Our results also indicate that FinBERT’s greater accuracy is especially relevant when empirical tests may suffer from low power, such as with small samples. Last, textual sentiments summarized by FinBERT can better predict future earnings than the LM dictionary, especially after 2011, consistent with firms’ strategic disclosures reducing the information content of textual sentiments measured with LM dictionary. Our results have implications for academic researchers, investment professionals, and financial market regulators who want to extract insights from financial texts.","PeriodicalId":256367,"journal":{"name":"Computational Linguistics & Natural Language Processing eJournal","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Linguistics & Natural Language Processing eJournal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2139/ssrn.3910214","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

In this paper, we develop FinBERT, a state-of-the-art deep learning algorithm that incorporates the contextual relations between words in the finance domain. First, using a researcher-labeled analyst report sample, we document that FinBERT significantly outperforms the Loughran and McDonald (LM) dictionary, the naïve Bayes, and Word2Vec in sentiment classification, primarily because of its ability to uncover sentiment in sentences that other algorithms mislabel as neutral. Next, we show that other approaches underestimate the textual informativeness of earnings conference calls by at least 32% compared with FinBERT. Our results also indicate that FinBERT’s greater accuracy is especially relevant when empirical tests may suffer from low power, such as with small samples. Last, textual sentiments summarized by FinBERT can better predict future earnings than the LM dictionary, especially after 2011, consistent with firms’ strategic disclosures reducing the information content of textual sentiments measured with LM dictionary. Our results have implications for academic researchers, investment professionals, and financial market regulators who want to extract insights from financial texts.
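As an illustration of the sentence-level sentiment classification task the abstract describes, the sketch below scores a few example sentences with a fine-tuned FinBERT-style checkpoint through the Hugging Face transformers library. The checkpoint id "yiyanghkust/finbert-tone" and the three tone labels are assumptions based on publicly released FinBERT models, not details taken from this abstract; treat this as a minimal usage sketch rather than the authors' exact pipeline.

```python
# Minimal sketch: sentence-level financial sentiment with a FinBERT-style model.
# The model id "yiyanghkust/finbert-tone" is an assumed public checkpoint, not
# something specified in the abstract above.
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_name = "yiyanghkust/finbert-tone"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# The text-classification pipeline returns the highest-scoring label per input.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

sentences = [
    "Revenue grew 12% year over year, ahead of guidance.",
    "The company expects restructuring charges to weigh on margins.",
    "The board declared a quarterly dividend of $0.25 per share.",
]

for sentence, result in zip(sentences, classifier(sentences)):
    print(f"{result['label']:>8}  {result['score']:.3f}  {sentence}")
```

A dictionary-based approach such as Loughran and McDonald would label many such sentences neutral when no listed sentiment word appears, which is the kind of case the paper argues contextual models handle better.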