Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP): Latest Publications

AstBERT: Enabling Language Model for Financial Code Understanding with Abstract Syntax Trees

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 2022-01-20 DOI: 10.18653/v1/2022.finnlp-1.2

Rong Liang, Tiehu Zhang, Y. Lu, Yuze Liu, Zhengqing Huang, Xin Chen

Abstract: Using pre-trained language models to understand source code has attracted increasing attention from financial institutions, owing to its great potential to uncover financial risks. However, there are several challenges in applying these language models directly to programming-language problems. For instance, the shift in domain knowledge between natural language (NL) and programming language (PL) requires understanding the semantic and syntactic information in the data from different perspectives. To this end, we propose AstBERT, a pre-trained PL model that aims to better understand financial code using the abstract syntax tree (AST). Specifically, we collect a large amount of source code (both Java and Python) from the Alipay code repository and incorporate both syntactic and semantic code knowledge into our model with the help of code parsers, through which the AST information of the source code can be interpreted and integrated. We evaluate the proposed model on three tasks: code question answering, code clone detection, and code refinement. Experimental results show that AstBERT achieves promising performance on all three downstream tasks.
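
The abstract above mentions using code parsers to obtain AST information from source code. As a minimal sketch of that general idea only (not the authors' pipeline), Python's standard-library `ast` module can parse a snippet and list its node types in depth-first order. The function name `ast_node_sequence` and the sample snippet are hypothetical and chosen purely for illustration.

```python
import ast


def ast_node_sequence(source):
    """Return the AST node type names of `source` in depth-first pre-order."""
    tree = ast.parse(source)
    names = []

    def visit(node):
        # Record the node type, then descend into its children in field order.
        names.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    return names


if __name__ == "__main__":
    # Hypothetical snippet standing in for a piece of financial code.
    snippet = "def fee(amount):\n    return amount * 0.01\n"
    print(ast_node_sequence(snippet))
```

A sequence like this is one simple way to expose syntactic structure alongside the raw tokens; how AstBERT actually integrates AST information is described in the paper itself.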

Citations: 1

Disentangled Variational Topic Inference for Topic-Accurate Financial Report Generation

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.3

Sixing Yan

Citations: 0

Prospectus Language and IPO Performance

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.21

Jared Sharpe, Keith S. Decker

Citations: 1

Using Transformer-based Models for Taxonomy Enrichment and Sentence Classification

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.34

Parag Dakle, Shrikumar Patil, Sai Krishna Rallabandi, Chaitra V. Hegde, Preethi Raghavan

Citations: 1

Yet at the FinNLP-2022 ERAI Task: Modified models for evaluating the Rationales of Amateur Investors

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.17

Zhuang Yan, Fuji Ren

Citations: 1

FinSim4-ESG Shared Task: Learning Semantic Similarities for the Financial Domain. Extended edition to ESG insights

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.28

Juyeon Kang, Ismail El Maarouf

Citations: 3

Knowledge informed sustainability detection from short financial texts

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.31

Boshko Koloski, Syrielle Montariol, Matthew Purver, S. Pollak

Citations: 0

A Sentiment and Emotion Annotated Dataset for Bitcoin Price Forecasting Based on Reddit Posts

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.27

Pavlo Seroyizhko, Zhanel Zhexenova, Muhammad Shafiq, Fabio Merizzi, Andrea Galassi, Federico Ruggeri

Abstract: Cryptocurrencies have gained enormous momentum in finance and are nowadays commonly adopted as a medium of exchange for online payments. After recent events during which GameStop's stock was believed to be influenced by the WallStreetBets subReddit, Reddit has become a very hot topic on the cryptocurrency market. The influence of public opinion on cryptocurrency price trends has inspired researchers to explore solutions that integrate such information into crypto price change forecasting. A popular integration technique is to represent social media opinions via sentiment features. However, this research direction is still in its infancy, with only a limited number of publicly available datasets with sentiment annotations. We propose a novel Bitcoin Reddit Sentiment Dataset, a ready-to-use dataset annotated with state-of-the-art sentiment and emotion recognition. The dataset contains pre-processed Reddit posts and comments about Bitcoin from several domain-related subReddits, along with Bitcoin's financial data. We evaluate several widely adopted neural architectures for crypto price change forecasting. Our results show controversial benefits of sentiment and emotion features, advocating for more sophisticated social media integration techniques. We make our dataset publicly available for research.

Citations: 2

DigiCall: A Benchmark for Measuring the Maturity of Digital Strategy through Company Earning Calls

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.7

Hilal Pataci, Kexuan Sun, T. Ravichandran

Abstract: Digital transformation reinvents companies, their vision and strategy, organizational structure, processes, capabilities, and culture, and enables the development of new or enhanced products and services delivered to customers more efficiently. By formalizing their digital strategy, organizations attempt to plan for their digital transformations and accelerate their company growth. Understanding how successful a company is in its digital transformation starts with accurate measurement of its digital maturity levels. However, existing approaches to measuring organizations' digital strategy have low accuracy levels, which leads to inconsistent results, and they do not provide resources (data) for future research to improve on. In order to measure the digital strategy maturity of companies, we leverage state-of-the-art NLP models on unstructured data (earning call transcripts) and reach state-of-the-art levels (94%) for this task. We release 3.691 earning call transcripts and an annotated data set, labeled specifically for digital strategy maturity by linguists. Our work provides an empirical baseline for research in industry and management science.

Citations: 0

TCS WITM 2022@FinSim4-ESG: Augmenting BERT with Linguistic and Semantic features for ESG data classification

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.finnlp-1.32

Tushar Goel, Vipul Chauhan, Suyash Sangwan, Ishan Verma, Tirthankar Dasgupta, Lipika Dey

Abstract: Advanced neural network architectures have provided several opportunities to develop systems that automatically capture information from domain-specific unstructured text sources. The FinSim4-ESG shared task, co-located with the FinNLP workshop, proposed two sub-tasks. In sub-task 1, the challenge was to design systems that could utilize contextual word embeddings along with sustainability resources to elaborate an ESG taxonomy. In sub-task 2, participants were asked to design a system that could classify sentences as sustainable or unsustainable. In this paper, we utilize semantic similarity features along with BERT embeddings to segregate domain terms into a fixed number of class labels. The proposed model not only considers the contextual BERT embeddings but also incorporates Word2Vec, cosine, and Jaccard similarity, which gives word-level importance to the model. For sentence classification, several linguistic elements along with BERT embeddings were used as classification features. We present a detailed ablation study for the proposed models.
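
The abstract above names cosine and Jaccard similarity among its word-level features. As a rough illustration of those two measures only, the sketch below computes them over whitespace-tokenized, lowercased text. The tokenization, the bag-of-words vectors, and the function names `jaccard` and `cosine` are assumptions made for this example; they are not the authors' feature pipeline, which also uses BERT and Word2Vec embeddings.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine similarity between bag-of-words count vectors of two strings."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical term pair; one shared token out of three distinct tokens.
    print(jaccard("carbon emissions", "carbon footprint"))
    print(cosine("carbon emissions", "carbon footprint"))
```

Jaccard looks only at which tokens overlap, while cosine also weights repeated tokens, so the two can diverge on longer terms.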

Citations: 1