Towards an automated record linkage process for datasource-independent company matching

Felix Kruse, Christoph Schröer, Jan-Philipp Awick, Jan Reinkensmeier, J. Gómez
Published in: 2022 3rd International Conference on Next Generation Computing Applications (NextComp)
Publication date: 2022-10-06
DOI: 10.1109/NextComp55567.2022.9932178 (https://doi.org/10.1109/NextComp55567.2022.9932178)
Citations: 0

Abstract

Record linkage (RL) is becoming increasingly important for companies that want to integrate data silos and build a higher-quality information base for decision-making. Despite state-of-the-art research results in RL, companies do not use them because the manual effort is high and the necessary know-how is lacking. This research-in-progress paper aims to show how a generic RL process for company matching can be developed and how the manual effort can be reduced. To this end, our data-driven, inductive research draws on insights from an extensive base of 18 company-relevant data sources. We implemented a first version of our generic RL process for company matching and applied it in three experiments with different data sources. The results show Precision ranging from 0.88 to 0.98 and Recall ranging from 0.90 to 0.99. These results are promising and indicate that developing a generic RL process for company matching is possible. Such a process would have a major impact on companies by making it more efficient to integrate new and previously unused data sources.
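
To make the terms in the abstract concrete, the following is a minimal sketch of company matching between two data sources and of how Precision and Recall are computed for the resulting links. It is not the authors' RL process: the normalization rule, the list of legal-form tokens, the similarity measure (Python's `difflib.SequenceMatcher`), the threshold of 0.85, and the toy company names are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher
from itertools import product

# Hypothetical legal-form tokens stripped before comparison (an assumption, not from the paper).
LEGAL_FORMS = {"gmbh", "ag", "inc", "ltd", "llc", "corp", "co", "kg", "se"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal-form tokens."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)


def link(source_a: dict, source_b: dict, threshold: float = 0.85) -> set:
    """Return all (id_a, id_b) pairs whose normalized names reach the similarity threshold.

    Compares every record in source_a with every record in source_b (no blocking),
    which is only practical for small inputs.
    """
    matches = set()
    for (id_a, name_a), (id_b, name_b) in product(source_a.items(), source_b.items()):
        score = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
        if score >= threshold:
            matches.add((id_a, id_b))
    return matches


def precision_recall(predicted: set, truth: set) -> tuple:
    """Precision = TP / predicted links; Recall = TP / true links."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data with fictional companies (illustrative only).
source_a = {"a1": "Acme GmbH", "a2": "Globex Inc.", "a3": "Initech Ltd"}
source_b = {"b1": "ACME", "b2": "Globex", "b3": "Initech Limited", "b4": "Umbrella AG"}
truth = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}

predicted = link(source_a, source_b)
precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy data the sketch finds two of the three true links: every predicted link is correct (Precision 1.0), but "Initech Ltd" and "Initech Limited" are missed because the hypothetical token list strips "ltd" and not "limited" (Recall 2/3). Gaps of this kind are one example of the manual tuning effort the abstract refers to.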