{"title":"The Global Banking Standards QA Dataset (GBS-QA)","authors":"Kyung-Woo Sohn, Sunjae Kwon, Jaesik Choi","doi":"10.18653/v1/2021.econlp-1.3","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.3","url":null,"abstract":"A domain specific question answering (QA) dataset dramatically improves the machine comprehension performance. This paper presents a new Global Banking Standards QA dataset (GBS-QA) in the banking regulation domain. The GBS-QA has three values. First, it contains actual questions from market players and answers from global rule setter, the Basel Committee on Banking Supervision (BCBS) in the middle of creating and revising banking regulations. Second, financial regulation experts analyze and verify pairs of questions and answers in the annotation process. Lastly, the GBS-QA is a totally different dataset with existing datasets in finance and is applicable to stimulate transfer learning research in the banking regulation domain.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125117492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fine-Grained Annotated Corpus for Target-Based Opinion Analysis of Economic and Financial Narratives","authors":"Jiahui Hu, P. Paroubek","doi":"10.18653/v1/2021.econlp-1.1","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.1","url":null,"abstract":"In this paper about aspect-based sentiment analysis (ABSA), we present the first version of a fine-grained annotated corpus for target-based opinion analysis (TBOA) to analyze economic activities or financial markets. We have annotated, at an intra-sentential level, a corpus of sentences extracted from documents representative of financial analysts’ most-read materials by considering how financial actors communicate about the evolution of event trends and analyze related publications (news, official communications, etc.). Since we focus on identifying the expressions of opinions related to the economy and financial markets, we annotated the sentences that contain at least one subjective expression about a domain-specific term. Candidate sentences for annotations were randomly chosen from texts of specialized press and professional information channels over a period ranging from 1986 to 2021. Our annotation scheme relies on various linguistic markers like domain-specific vocabulary, syntactic structures, and rhetorical relations to explicitly describe the author’s subjective stance. We investigated and evaluated the recourse to automatic pre-annotation with existing natural language processing technologies to alleviate the annotation workload. Our aim is to propose a corpus usable on the one hand as training material for the automatic detection of the opinions expressed on an extensive range of domain-specific aspects and on the other hand as a gold standard for evaluation TBOA.\nIn this paper, we present our pre-annotation models and evaluations of their performance, introduce our annotation scheme and report on the main characteristics of our corpus.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124516667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cryptocurrency Day Trading and Framing Prediction in Microblog Discourse","authors":"Anna Paula Pawlicka Maule, Kristen Marie Johnson","doi":"10.18653/v1/2021.econlp-1.11","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.11","url":null,"abstract":"With 56 million people actively trading and investing in cryptocurrency online and globally in 2020, there is an increasing need for automatic social media analysis tools to help understand trading discourse and behavior. In this work, we present a dual natural language modeling pipeline which leverages language and social network behaviors for the prediction of cryptocurrency day trading actions and their associated framing patterns. This pipeline first predicts if tweets can be used to guide day trading behavior, specifically if a cryptocurrency investor should buy, sell, or hold their cryptocurrencies in order to make a profit. Next, tweets are input to an unsupervised deep clustering approach to automatically detect trading framing patterns. Our contributions include the modeling pipeline for this novel task, a new Cryptocurrency Tweets Dataset compiled from influential accounts, and a Historical Price Dataset.\nOur experiments show that our approach achieves an 88.78% accuracy for day trading behavior prediction and reveals framing fluctuations prior to and during the COVID-19 pandemic that could be used to guide investment actions.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114693420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"To What Extent Can English-as-a-Second Language Learners Read Economic News Texts?","authors":"Yo Ehara","doi":"10.18653/v1/2021.econlp-1.9","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.9","url":null,"abstract":"In decision making in the economic field, an especially important requirement is to rapidly understand news to absorb ever-changing economic situations. Given that most economic news is written in English, the ability to read such information without waiting for a translation is particularly valuable in economics in contrast to other fields. In consideration of this issue, this research investigated the extent to which non-native English speakers are able to read economic news to make decisions accordingly – an issue that has been rarely addressed in previous studies. Using an existing standard dataset as training data, we created a classifier that automatically evaluates the readability of text with high accuracy for English learners. Our assessment of the readability of an economic news corpus revealed that most news texts can be read by intermediate English learners. We also found that in some cases, readability varies considerably depending on the knowledge of certain words specific to the economic field.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134166567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corporate Bankruptcy Prediction with Domain-Adapted BERT","authors":"Alex G. Kim, Sang-Yeong Yoon","doi":"10.18653/v1/2021.econlp-1.4","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.4","url":null,"abstract":"This study performs BERT-based analysis, which is a representative contextualized language model, on corporate disclosure data to predict impending bankruptcies. Prior literature on bankruptcy prediction mainly focuses on developing more sophisticated prediction methodologies with financial variables. However, in our study, we focus on improving the quality of input dataset. Specifically, we employ BERT model to perform sentiment analysis on MD&A disclosures. We show that BERT outperforms dictionary-based predictions and Word2Vec-based predictions in terms of adjusted R-square in logistic regression, k-nearest neighbor (kNN-5), and linear kernel support vector machine (SVM). Further, instead of pre-training the BERT model from scratch, we apply self-learning with confidence-based filtering to corporate disclosure data (10-K). We achieve the accuracy rate of 91.56% and demonstrate that the domain adaptation procedure brings a significant improvement in prediction accuracy.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114647298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Word Embedding to Reveal Monetary Policy Explanation Changes","authors":"Akira Matsui, Xiang Ren, Emilio Ferrara","doi":"10.18653/v1/2021.econlp-1.8","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.8","url":null,"abstract":"Documents have been an essential tool of communication for governments to announce their policy operations. Most policy announcements have taken the form of text to inform their new policies or changes to the public. To understand such policymakers’ communication, many researchers exploit published policy documents. However, the methods well-used in other research domains such as sentiment analysis or topic modeling are not suitable for studying policy communications. Their training corpora and methods are not for policy documents where technical terminologies are used, and sentiment expressions are refrained. We leverage word embedding techniques to extract semantic changes in the monetary policy documents. Our empirical study shows that the policymaker uses different semantics according to the type of documents when they change their policy.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130847136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Stock Prediction to Financial Relevance: Repurposing Attention Weights to Assess News Relevance Without Manual Annotations","authors":"Luciano Del Corro, Johannes Hoffart","doi":"10.18653/v1/2021.econlp-1.6","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.6","url":null,"abstract":"We present a method to automatically identify financially relevant news using stock price movements and news headlines as input. The method repurposes the attention weights of a neural network initially trained to predict stock prices to assign a relevance score to each headline, eliminating the need for manually labeled training data. Our experiments on the four most relevant US stock indices and 1.5M news headlines show that the method ranks relevant news highly, positively correlated with the accuracy of the initial stock price prediction task.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116179708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting Economic Signals from Central Bank Speeches","authors":"M. Ahrens, Michael McMahon","doi":"10.18653/v1/2021.econlp-1.12","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.12","url":null,"abstract":"Estimating the effects of monetary policy is one of the fundamental research questions in monetary economics. Many economies are facing ultra-low interest rate environments ever since the global financial crisis of 2007-9. The Covid pandemic recently reinforced this situation. In the US and Europe, interest rates are close to (or even below) zero, which limits the scope of traditional monetary policy measures for central banks. Dedicated central bank communication has hence become an increasingly important tool to steer and control market expectations these days. However, incorporating central bank language directly as features into economic models is still a very nascent research area. In particular, the content and effect of central bank speeches has been mostly neglected from monetary policy modelling so far. With our paper, we aim to provide to the research community a novel, monetary policy shock series based on central bank speeches. We use a supervised topic modeling approach that can deal with text as well as numeric covariates to estimate a monetary policy signal dispersion index along three key economic dimensions: GDP, CPI and unemployment. This “dispersion shock” series is not only more frequent than series that classically focus on policy announcement dates, it also opens up the possibility of answering new questions that have up until now been difficult to analyse. For example, do markets form different expectations when facing a “cacophony of policy voices”?\nOur initial findings for the US point towards the fact that more dispersed or incongruent monetary policy stance communication in the build up to Federal Open Market Committee (FOMC) meetings might be associated with stronger subsequent market surprises at FOMC policy announcement time.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121057658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is Domain Adaptation Worth Your Investment? Comparing BERT and FinBERT on Financial Tasks","authors":"Bo Peng, Emmanuele Chersoni, Yu-Yin Hsu, Chu-Ren Huang","doi":"10.18653/v1/2021.econlp-1.5","DOIUrl":"https://doi.org/10.18653/v1/2021.econlp-1.5","url":null,"abstract":"With the recent rise in popularity of Transformer models in Natural Language Processing, research efforts have been dedicated to the development of domain-adapted versions of BERT-like architectures. In this study, we focus on FinBERT, a Transformer model trained on text from the financial domain. By comparing its performances with the original BERT on a wide variety of financial text processing tasks, we found continual pretraining from the original model to be the more beneficial option. Domain-specific pretraining from scratch, conversely, seems to be less effective.","PeriodicalId":166554,"journal":{"name":"Proceedings of the Third Workshop on Economics and Natural Language Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130444084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}