Detection of Temporality at Discourse Level on Financial News by Combining Natural Language Processing and Machine Learning
Silvia García-Méndez, Francisco de Arriba-Pérez, Ana Barros-Vila, Francisco J. González-Castaño
arXiv - QuantFin - Statistical Finance, 2024-03-30, arxiv-2404.01337
Abstract
Finance-related news outlets such as Bloomberg News, CNN Business and Forbes are
valuable sources of real data for market screening systems. In news articles, an expert
shares opinions that go beyond plain technical analysis and include context such as
political, sociological and cultural factors. In the same text, the expert
often discusses the performance of different assets. Some key statements are
mere descriptions of past events while others are predictions. Therefore,
understanding the temporality of the key statements in a text is essential to
separate context information from valuable predictions. We propose a novel
system to detect the temporality of finance-related news at discourse level
that combines Natural Language Processing and Machine Learning techniques, and
exploits sophisticated features such as syntactic and semantic dependencies.
More specifically, we seek to extract the dominant tenses of the main
statements, which may be either explicit or implicit. We have tested our system
on a labelled dataset of finance-related news annotated by researchers with
knowledge in the field. Experimental results show high detection precision
compared with an alternative rule-based baseline approach. Ultimately, this
research advances the state of the art in market screening by identifying
predictive knowledge for financial decision-making.
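
The abstract mentions a rule-based baseline but does not describe it. As a rough illustration of what rule-based dominant-tense detection could look like, the following minimal Python sketch labels each sentence from surface cues and takes a majority vote. The cue lists, the function names `statement_tense` and `dominant_tense`, and the three-way past/present/future labelling are all assumptions made for this example; this is neither the authors' system nor their baseline.

```python
import re
from collections import Counter

# Hypothetical cue lists, chosen for illustration only (not taken from the paper).
FUTURE_CUES = re.compile(
    r"\b(will|shall|going to|expected to|likely to|forecasts?|predicts?)\b",
    re.IGNORECASE,
)
# Crude heuristic: a few irregular past forms plus any word ending in "-ed".
PAST_CUES = re.compile(
    r"\b(was|were|had|did|rose|fell|gained|lost|closed)\b|\b\w+ed\b",
    re.IGNORECASE,
)


def statement_tense(statement: str) -> str:
    """Label one statement as 'future', 'past' or 'present' from surface cues."""
    # Future cues are checked first so that "is expected to" is not read as past.
    if FUTURE_CUES.search(statement):
        return "future"
    if PAST_CUES.search(statement):
        return "past"
    return "present"


def dominant_tense(text: str) -> str:
    """Split text into sentences and return the most frequent tense label."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return "unknown"
    counts = Counter(statement_tense(s) for s in sentences)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    news = (
        "Shares rose 3% on Monday. The index fell sharply. "
        "Analysts say it will recover."
    )
    print(dominant_tense(news))
```

Surface cues of this kind miss implicit temporality (for example, a prediction phrased in the present tense), which is the sort of case the abstract says motivates features based on syntactic and semantic dependencies.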