Removal mechanism of redundant blank nodes in linked data
Authors: Lu Yang, Li Huang, Haichuan Lu, Fangfang Xu
Published in: 2018 13th IEEE Conference on Industrial Electronics and Applications (ICIEA), 2018-05-01
DOI: 10.1109/ICIEA.2018.8397878 (https://doi.org/10.1109/ICIEA.2018.8397878)
Citation count: 1
Abstract
In the development of the semantic web, blank nodes (also called anonymous nodes or anonymous resources) are a significant source of data redundancy. Blank nodes are nodes of an RDF graph that are not identified by a URI. They are convenient for resources that are complex and have no URI of their own but still have a property structure. Precisely because blank nodes have no URI, different people may create different blank nodes for the same anonymous resource, which causes considerable information redundancy. This paper proposes a method that first detects the blank nodes according to their features and then dictionary-encodes the triples of the RDF graph, representing blank nodes by negative numbers so that triples containing blank nodes are easy to query. Then, following the mining rules of linked data, it constructs all S-Models (the set of triples that use s as subject), O-Models (the set of triples that use o as object) and B-Blanks (the collection of blank nodes) of the RDF graph. It traverses the B-Blanks collection and removes the redundancy in each SB-Model (the set of triples that use blank node b as subject) and OB-Model (the set of triples that use blank node b as object). Experimental results show that the proposed blank node detection method is very efficient, and that compression and storage of the processed RDF file become more efficient. The experiments also show that using dictionary-encoded linked data to detect and remove blank nodes greatly improves operating efficiency. The detection in this paper relies only on how blank nodes are represented in triples, so it cannot detect blank nodes in data given in the common RDF graph format. Likewise, the SOBM algorithm proposed in this paper currently targets only the simple RDF data model and does not yet remove blank nodes well beyond it.
Future work therefore mainly includes: (1) improving the blank node detection method so that it better supports RDF data formats of various structures; (2) improving the blank node removal algorithm so that it deals better with the blank node chain problem.
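The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SOBM implementation: the abstract does not state the exact redundancy rule, so the sketch assumes that two blank nodes are redundant when their SB-Model and OB-Model are identical. The function names, the `_:label` blank node syntax, and the sample triples are all hypothetical. Like the method described above, it does not resolve blank node chains.

```python
from collections import defaultdict


def is_blank(term):
    # Assumption: blank nodes use the N-Triples style "_:label" syntax.
    return term.startswith("_:")


def encode(triples):
    """Dictionary-encode terms: blank nodes get negative IDs, all others positive."""
    ids = {}
    next_pos, next_neg = 1, -1
    encoded = []
    for triple in triples:
        row = []
        for term in triple:
            if term not in ids:
                if is_blank(term):
                    ids[term] = next_neg
                    next_neg -= 1
                else:
                    ids[term] = next_pos
                    next_pos += 1
            row.append(ids[term])
        encoded.append(tuple(row))
    return encoded, ids


def build_models(encoded):
    """Build the S-Model, O-Model and B-Blanks collections from encoded triples."""
    s_model = defaultdict(set)  # subject -> {(predicate, object)}
    o_model = defaultdict(set)  # object  -> {(subject, predicate)}
    for s, p, o in encoded:
        s_model[s].add((p, o))
        o_model[o].add((s, p))
    # Negative IDs mark blank nodes; sort them in order of first appearance.
    b_blanks = sorted({t for s, _, o in encoded for t in (s, o) if t < 0},
                      reverse=True)
    return s_model, o_model, b_blanks


def remove_redundant_blanks(encoded):
    """Merge blank nodes whose SB-Model and OB-Model are identical (assumed rule)."""
    s_model, o_model, b_blanks = build_models(encoded)
    representative = {}  # (SB-Model, OB-Model) signature -> blank node kept
    merged = {}          # redundant blank node -> blank node kept
    for b in b_blanks:
        signature = (frozenset(s_model.get(b, ())), frozenset(o_model.get(b, ())))
        keep = representative.setdefault(signature, b)
        if keep != b:
            merged[b] = keep
    # Rewrite every triple; the set drops the duplicates created by merging.
    return {tuple(merged.get(t, t) for t in triple) for triple in encoded}


# Hypothetical sample data: _:a1 and _:a2 describe the same anonymous address.
triples = [
    ("ex:alice", "ex:address", "_:a1"),
    ("_:a1", "ex:city", '"Springfield"'),
    ("ex:alice", "ex:address", "_:a2"),
    ("_:a2", "ex:city", '"Springfield"'),
    ("ex:bob", "ex:address", "_:a3"),
    ("_:a3", "ex:city", '"Springfield"'),
]
encoded, ids = encode(triples)
cleaned = remove_redundant_blanks(encoded)
print(len(encoded), len(cleaned))  # 6 4
```

Here `_:a2` is merged into `_:a1` because both have the same incoming and outgoing triples, while `_:a3` is kept because it is attached to a different subject. Requiring identical models on both sides is a deliberately conservative choice for the sketch; the paper may use a different rule.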