Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP): Latest Papers

AstBERT: Enabling Language Model for Financial Code Understanding with Abstract Syntax Trees
Rong Liang, Tiehu Zhang, Y. Lu, Yuze Liu, Zhengqing Huang, Xin Chen
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.2
Published: 2022-01-20
Abstract: Using pre-trained language models to understand source code has attracted increasing attention from financial institutions owing to its great potential to uncover financial risks. However, there are several challenges in applying these language models directly to programming-language problems. For instance, the shift of domain knowledge between natural language (NL) and programming language (PL) requires understanding the semantic and syntactic information in the data from different perspectives. To this end, we propose the AstBERT model, a pre-trained PL model that aims to better understand financial code using the abstract syntax tree (AST). Specifically, we collect a large amount of source code (both Java and Python) from the Alipay code repository and incorporate both syntactic and semantic code knowledge into our model with the help of code parsers, through which the AST information of the source code can be interpreted and integrated. We evaluate the proposed model on three tasks: code question answering, code clone detection, and code refinement. Experimental results show that AstBERT achieves promising performance on all three downstream tasks.
Citations: 1
Disentangled Variational Topic Inference for Topic-Accurate Financial Report Generation
Sixing Yan
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.3
Abstract: Automatically generating a financial report from a set of news articles is important but challenging. A financial report is composed of key points from the news together with the corresponding inference and reasoning of financial-domain specialists with professional knowledge. The challenges lie in effectively learning the extra knowledge that is not well presented in the news, and in the misalignment between the topics of the input news and the output knowledge in the target reports. In this work, we introduce a disentangled variational topic inference approach that learns two latent variables, one for news and one for reports. We use a publicly available dataset to evaluate the proposed approach. The results demonstrate its effectiveness in enhancing the language informativeness and topic accuracy of the generated financial reports.
Citations: 0
Prospectus Language and IPO Performance
Jared Sharpe, Keith S. Decker
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.21
Abstract: Pricing a firm’s Initial Public Offering (IPO) has historically been very difficult, with high average returns on the first day of trading. Furthermore, IPO withdrawal, the event in which companies that file to go public ultimately rescind the application before the offering, is an equally challenging prediction problem. This research uses word embedding techniques to evaluate existing theories concerning the effect of firm sentiment on first-day trading performance and on the probability of withdrawal, which has not yet been explored empirically. The results suggest that firms attempting to go public experience a decreased probability of withdrawal with the increased presence of positive, litigious, and uncertain language in their initial prospectus, while the increased presence of strong modal language leads to an increased probability of withdrawal. The results also suggest that frequent or large adjustments in the strong modal language of subsequent filings lead to smaller first-day returns.
Citations: 1
Using Transformer-based Models for Taxonomy Enrichment and Sentence Classification
Parag Dakle, Shrikumar Patil, Sai Krishna Rallabandi, Chaitra V. Hegde, Preethi Raghavan
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.34
Abstract: In this paper, we present a system that addresses the taxonomy enrichment problem for Environment, Social and Governance issues in the financial domain, and that classifies sentences as sustainable or unsustainable, for FinSim4-ESG, a shared task of the FinNLP workshop at IJCAI-2022. We first created a derived dataset for taxonomy enrichment by using a sentence-BERT-based paraphrase detector (Reimers and Gurevych, 2019) on the train set to create positive and negative term-concept pairs. We then model the problem by fine-tuning the sentence-BERT-based paraphrase detector on this derived dataset and using it as the encoder, with a Logistic Regression classifier as the decoder, resulting in a test accuracy of 0.6 and an average rank of 1.97. For the sentence classification task, the best-performing classifier (accuracy: 0.92) consists of a pre-trained RoBERTa model (Liu et al., 2019a) as the encoder and a feed-forward neural network classifier as the decoder.
Citations: 1
Yet at the FinNLP-2022 ERAI Task: Modified models for evaluating the Rationales of Amateur Investors
Zhuang Yan, Fuji Ren
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.17
Abstract: Financial reports usually reveal the recent development of a company and often cause volatility in the company’s share price. Opinions that lead to higher maximal potential profit and lower maximal loss can help amateur investors choose rational strategies. The FinNLP-2022 ERAI task aims to quantify the potential of opinions to lead to higher maximal potential profit and lower maximal loss. In this paper, different strategies were applied to solve the ERAI tasks. Vanilla ‘RoBERTa-wwm’ showed excellent performance and helped us rank second in the ‘MPP’ label prediction task. After integrating some tricks, the modified ‘RoBERTa-wwm’ outperformed all other models in the ‘ML’ ranking task.
Citations: 1
FinSim4-ESG Shared Task: Learning Semantic Similarities for the Financial Domain. Extended edition to ESG insights
Juyeon Kang, Ismail El Maarouf
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.28
Abstract: This paper describes the FinSim4-ESG shared task organized in the 4th FinNLP workshop, which is held in conjunction with the IJCAI-ECAI-2022 conference. This year, FinSim4 is extended to Environmental, Social and Governance (ESG) insights and proposes two subtasks, one for ESG Taxonomy Enrichment and the other for Sustainable Sentence Prediction. Among the 28 teams registered for the shared task, a total of 8 teams submitted their system results and 6 teams also submitted a paper describing their method. The winner of each subtask shows good performance, with accuracies of 0.85 and 0.95, respectively.
Citations: 3
Knowledge informed sustainability detection from short financial texts
Boshko Koloski, Syrielle Montariol, Matthew Purver, S. Pollak
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.31
Abstract: There is a global trend toward responsible investing, and the need for automated methods for analyzing Environmental, Social and Governance (ESG) related elements in financial texts is rising. In this work we propose a solution to the FinSim4-ESG task, which consists of binary classification of sentences as sustainable or unsustainable. We propose a novel knowledge-based latent heterogeneous representation that is based on knowledge from taxonomies and knowledge graphs and on multiple contemporary document representations. We hypothesize that an approach combining knowledge and document representations can yield significant improvement over conventional document representation approaches. We consider ensembles at the classifier level as well as at the representation level, with late fusion and early fusion. The proposed approaches achieve a competitive accuracy of 89 and are 5.85 behind the best achieved score.
Citations: 0
A Sentiment and Emotion Annotated Dataset for Bitcoin Price Forecasting Based on Reddit Posts
Pavlo Seroyizhko, Zhanel Zhexenova, Muhammad Shafiq, Fabio Merizzi, Andrea Galassi, Federico Ruggeri
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.27
Abstract: Cryptocurrencies have gained enormous momentum in finance and are nowadays commonly adopted as a medium of exchange for online payments. After recent events during which GameStop’s stock was believed to be influenced by the WallStreetBets subreddit, Reddit has become a very hot topic on the cryptocurrency market. The influence of public opinion on cryptocurrency price trends has inspired researchers to explore solutions that integrate such information into crypto price change forecasting. A popular integration technique is to represent social media opinions via sentiment features. However, this research direction is still in its infancy, and only a limited number of publicly available datasets with sentiment annotations exist. We propose a novel Bitcoin Reddit Sentiment Dataset, a ready-to-use dataset annotated with state-of-the-art sentiment and emotion recognition. The dataset contains pre-processed Reddit posts and comments about Bitcoin from several domain-related subreddits along with Bitcoin’s financial data. We evaluate several widely adopted neural architectures for crypto price change forecasting. Our results show controversial benefits of sentiment and emotion features, advocating for more sophisticated social media integration techniques. We make our dataset publicly available for research.
Citations: 2
DigiCall: A Benchmark for Measuring the Maturity of Digital Strategy through Company Earning Calls
Hilal Pataci, Kexuan Sun, T. Ravichandran
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.7
Abstract: Digital transformation reinvents companies, their vision and strategy, organizational structure, processes, capabilities, and culture, and enables the development of new or enhanced products and services delivered to customers more efficiently. By formalizing their digital strategy, organizations attempt to plan for their digital transformations and accelerate company growth. Understanding how successful a company is in its digital transformation starts with accurate measurement of its digital maturity level. However, existing approaches to measuring organizations’ digital strategy have low accuracy, which leads to inconsistent results, and they do not provide resources (data) for future research to improve on. To measure the digital strategy maturity of companies, we apply state-of-the-art NLP models to unstructured data (earnings call transcripts) and reach state-of-the-art levels (94%) for this task. We release 3,691 earnings call transcripts as well as an annotated dataset, labeled by linguists specifically for digital strategy maturity. Our work provides an empirical baseline for research in industry and management science.
Citations: 0
TCS WITM 2022@FinSim4-ESG: Augmenting BERT with Linguistic and Semantic features for ESG data classification
Tushar Goel, Vipul Chauhan, Suyash Sangwan, Ishan Verma, Tirthankar Dasgupta, Lipika Dey
DOI: https://doi.org/10.18653/v1/2022.finnlp-1.32
Abstract: Advanced neural network architectures have provided several opportunities to develop systems that automatically capture information from domain-specific unstructured text sources. The FinSim4-ESG shared task, co-located with the FinNLP workshop, proposed two sub-tasks. In sub-task 1, the challenge was to design systems that could use contextual word embeddings along with sustainability resources to elaborate an ESG taxonomy. In the second sub-task, participants were asked to design a system that could classify sentences as sustainable or unsustainable. In this paper, we use semantic similarity features along with BERT embeddings to segregate domain terms into a fixed number of class labels. The proposed model not only considers the contextual BERT embeddings but also incorporates Word2Vec, cosine, and Jaccard similarity, which give word-level importance to the model. For sentence classification, several linguistic elements along with BERT embeddings were used as classification features. We present a detailed ablation study for the proposed models.
Citations: 1