Multi-representation approach to text regression of financial risks

Roman Trusov, Alexey Natekin, Pavel Kalaidin, Sergey Ovcharenko, A. Knoll, Aida Fazylova
{"title":"Multi-representation approach to text regression of financial risks","authors":"Roman Trusov, Alexey Natekin, Pavel Kalaidin, Sergey Ovcharenko, A. Knoll, Aida Fazylova","doi":"10.1109/AINL-ISMW-FRUCT.2015.7382979","DOIUrl":null,"url":null,"abstract":"Different approaches for textual feature extraction have been proposed starting with simple word count features and continuing with deeper representations capturing distributional semantics. In recent publications word embedding methods have been successfully used as a representation basis for a large number of NLP tasks like text classification, part of speech tagging and many others. In this article we explore opportunities of using multiple text representations simultaneously within one regression task in order to exploit conventional bag of words approach with the more semantically rich embeddings. We investigate performance of this multi-representation approach on the financial risk prediction problem. Publicly available 10-K reports filled by US trading companies are used as the basis for predicting next year change in stock price volatility. Our study shows that models based on single representations achieve performance that is comparable to the previously published results on risk prediction and models with multiple representations benefit from complementary information and outperform both baseline and single representation models.","PeriodicalId":122232,"journal":{"name":"2015 Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AINL-ISMW-FRUCT.2015.7382979","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Different approaches for textual feature extraction have been proposed starting with simple word count features and continuing with deeper representations capturing distributional semantics. In recent publications word embedding methods have been successfully used as a representation basis for a large number of NLP tasks like text classification, part of speech tagging and many others. In this article we explore opportunities of using multiple text representations simultaneously within one regression task in order to exploit conventional bag of words approach with the more semantically rich embeddings. We investigate performance of this multi-representation approach on the financial risk prediction problem. Publicly available 10-K reports filled by US trading companies are used as the basis for predicting next year change in stock price volatility. Our study shows that models based on single representations achieve performance that is comparable to the previously published results on risk prediction and models with multiple representations benefit from complementary information and outperform both baseline and single representation models.
金融风险文本回归的多表征方法
已经提出了不同的文本特征提取方法,从简单的词计数特征开始,继续使用捕获分布语义的更深层表示。在最近的出版物中,词嵌入方法已经成功地用作大量NLP任务的表示基础,如文本分类、词性标注等。在本文中,我们探索了在一个回归任务中同时使用多个文本表示的机会,以便利用具有更丰富语义嵌入的传统词袋方法。我们研究了这种多表示方法在金融风险预测问题上的性能。由美国贸易公司填写的公开的10-K报告被用作预测明年股价波动变化的基础。我们的研究表明,基于单一表征的模型在风险预测方面的表现与之前发表的结果相当,而具有多个表征的模型受益于互补信息,并且优于基线模型和单一表征模型。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信