Methods of entity resolution in dataspaces

International Conference on Images, Signals, and Computing. Pub Date: 2023-08-21. DOI: 10.1117/12.2692046

Yuelin Jia, Wei Lu, Chang Su


Abstract

A dataspace is a new approach to data integration. Entity resolution identifies pairs of records that refer to the same real-world entity. In this paper, a record graph is constructed from the records in the data set. Redundant comparisons are removed by pruning the record graph, and the records are divided into blocks according to the pruned graph. Subsequent entity resolution is carried out only within blocks. When entities are resolved within a block, attribute mapping and an expression-based representation of attribute values are used to further partition the data and ensure resolution accuracy. The methods were evaluated experimentally on real data sets.
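
The graph-based blocking step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how edges are weighted or pruned, so everything here is an assumption made for illustration: records are plain dicts, an edge's weight is the number of tokens two records share, pruning drops edges below a fixed `min_weight`, and blocks are the connected components of the pruned graph. The function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def tokens(record):
    """Lowercased word tokens from all attribute values of a record (a dict)."""
    return {tok for value in record.values() for tok in str(value).lower().split()}


def build_record_graph(records):
    """Return {(i, j): weight}; weight = number of tokens shared by records i and j (assumed scheme)."""
    index = defaultdict(set)  # token -> ids of the records that contain it
    for rid, record in enumerate(records):
        for tok in tokens(record):
            index[tok].add(rid)
    edges = defaultdict(int)
    for ids in index.values():
        for i, j in combinations(sorted(ids), 2):
            edges[(i, j)] += 1
    return dict(edges)


def prune(edges, min_weight):
    """Keep only edges whose weight reaches the threshold (assumed pruning rule)."""
    return {pair: w for pair, w in edges.items() if w >= min_weight}


def blocks_from_graph(n, edges):
    """Group record ids 0..n-1 into blocks: connected components via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = defaultdict(list)
    for rid in range(n):
        groups[find(rid)].append(rid)
    return sorted(groups.values())


# Toy records with heterogeneous attribute names, as is typical in a dataspace.
records = [
    {"name": "John Smith", "city": "Boston"},
    {"full_name": "john smith", "location": "boston ma"},
    {"name": "Mary Jones", "city": "Boston"},
    {"title": "Mary Jones", "town": "Denver"},
]

graph = build_record_graph(records)
blocks = blocks_from_graph(len(records), prune(graph, min_weight=2))
print(blocks)  # [[0, 1], [2, 3]]
```

In this toy run the weak edges created by the shared token "boston" are pruned, so pairwise comparison would only be needed inside the two remaining blocks rather than across all six record pairs.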
