The Challenges of Creating, Maintaining and Exploring Graphs of Financial Entities

Proceedings of the Fourth International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets. Pub Date: 2018-06-15. DOI: 10.1145/3220547.3220553

M. Loster, Tim Repke, Ralf Krestel, Felix Naumann, Jan Ehmueller, Benjamin Feldmann, Oliver Maspfuhl


Citations: 1

Abstract

1 OVERVIEW & MOTIVATION

The integration of a wide range of structured and unstructured information sources into a uniformly integrated knowledge base is an important task in the financial sector. As an example, modern risk analysis methods can benefit greatly from an integrated knowledge base, building in particular a dedicated, domain-specific knowledge graph. Knowledge graphs can be used to gain a holistic view of the current economic situation so that systemic risks can be identified early enough to react appropriately. The use of this graphical structure thus allows the investigation of many financial scenarios, such as the impact of corporate bankruptcy on other market participants within the network. In this particular scenario, the links between the individual market participants can be used to determine which companies are affected by a bankruptcy and to what extent. We took these considerations as a motivation to start the development of a system capable of constructing and maintaining a knowledge graph of financial entities and their relationships. The envisioned system generates this particular graph by extracting and combining information from both structured data sources such as Wikidata and DBpedia, as well as from unstructured data sources such as newspaper articles and financial filings. In addition, the system should incorporate proprietary data sources, such as financial transactions (structured) and credit reports (unstructured). The ultimate goal is to create a system that recognizes financial entities in structured and unstructured sources, links them with the information of a knowledge base, and then extracts the relations expressed in the text between the identified entities. The constructed knowledge base can be used to construct the desired knowledge graph. Our system design consists of several components, each of which addresses a specific subproblem. To this end, Figure 1 gives a general overview of our system and its subcomponents.
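
The bankruptcy scenario mentioned in the abstract can be illustrated with a small sketch. This is not the authors' system; it is a minimal illustration under stated assumptions: the company names, the edge weights, the `threshold` parameter, and the function name `bankruptcy_impact` are all hypothetical, and "extent" is modelled simply as the product of edge weights along the strongest path from the failed company. Each weight is assumed to lie between 0 and 1.

```python
from collections import deque

# Hypothetical exposure graph (illustrative data only):
# EXPOSURES[a][b] is the assumed fraction of a's distress passed on to b.
EXPOSURES = {
    "AlphaBank": {"BetaCorp": 0.5, "GammaLtd": 0.2},
    "BetaCorp": {"DeltaAG": 0.4},
    "GammaLtd": {"DeltaAG": 0.5},
    "DeltaAG": {},
}


def bankruptcy_impact(graph, failed, threshold=0.05):
    """Return {company: impact} for companies affected by `failed`.

    Impact is the largest product of edge weights over any path from
    the failed company; paths whose impact drops below `threshold`
    are not followed further.
    """
    impact = {failed: 1.0}
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for neighbour, weight in graph.get(node, {}).items():
            candidate = impact[node] * weight
            # Revisit a company only if a stronger path to it was found.
            if candidate >= threshold and candidate > impact.get(neighbour, 0.0):
                impact[neighbour] = candidate
                queue.append(neighbour)
    del impact[failed]  # report only the other market participants
    return impact


if __name__ == "__main__":
    for company, score in sorted(bankruptcy_impact(EXPOSURES, "AlphaBank").items()):
        print(f"{company}: {score:.2f}")
```

In a real knowledge graph the edges would come from the extracted relations (for example ownership or supplier links), and a realistic contagion model would likely be more involved than a single strongest-path score.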



