Authors: Cuifang Zhao, Xiang Zhang, Peng Wang
Venue: 2012 Seventh International Conference on Knowledge, Information and Creativity Support Systems
Published: 2012-11-08
DOI: 10.1109/KICSS.2012.15 (https://doi.org/10.1109/KICSS.2012.15)
A Label-Based Partitioning Strategy for Mining Link Patterns
With the explosive growth of online linked data, the task of mining link patterns is attracting increasing attention. A practical issue is how to mine efficiently in large-scale linked data. Existing pattern mining algorithms usually assume that the dataset fits into main memory, whereas linked data comprising billions of triples far exceeds that limit. In this paper we present a pilot study of a novel partitioning strategy for mining link patterns in large-scale linked data. First, we propose an algorithm named Par Group that divides large linked data and groups it into partitions based on vertex labels. Second, an adapted gSpan is applied to mine link patterns in each partition. Finally, the discovered link patterns are merged into a global result set. Experiments show that our strategy is feasible and promising in some scenarios.
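
The three stages described above (partition by vertex label, mine each partition, merge the results) can be illustrated with a minimal Python sketch. It is not the paper's Par Group algorithm or its adapted gSpan, neither of which is specified in the abstract. As assumptions for illustration only, triples are grouped by the label of their subject vertex, label groups are packed greedily into a fixed number of partitions, and the per-partition miner is a simple stand-in that counts single-edge patterns of the form (subject label, predicate, object label). All function names and parameters are hypothetical.

```python
from collections import Counter, defaultdict


def partition_by_label(triples, labels, num_partitions):
    """Group triples by subject label, then pack label groups into partitions.

    Assumption: a triple belongs to the group of its subject's label.
    """
    by_label = defaultdict(list)
    for s, p, o in triples:
        by_label[labels[s]].append((s, p, o))

    partitions = [[] for _ in range(num_partitions)]
    # Greedy packing: the largest label group goes to the currently
    # smallest partition (ties broken by label name for determinism).
    for label in sorted(by_label, key=lambda l: (-len(by_label[l]), l)):
        smallest = min(partitions, key=len)
        smallest.extend(by_label[label])
    return partitions


def mine_partition(partition, labels):
    """Stand-in miner (not gSpan): count single-edge label patterns."""
    counts = Counter()
    for s, p, o in partition:
        counts[(labels[s], p, labels[o])] += 1
    return counts


def merge_patterns(per_partition_counts, min_support):
    """Sum local counts and keep patterns meeting the global support."""
    total = Counter()
    for counts in per_partition_counts:
        total.update(counts)
    return {pat: n for pat, n in total.items() if n >= min_support}


def mine_link_patterns(triples, labels, num_partitions=2, min_support=2):
    """Run the partition, mine, merge pipeline end to end."""
    partitions = partition_by_label(triples, labels, num_partitions)
    local = [mine_partition(part, labels) for part in partitions]
    return merge_patterns(local, min_support)
```

Because each triple lands in exactly one partition, summing the local counts gives exact global counts for these single-edge patterns. Larger patterns that span partitions would need additional handling in the merge step, which this sketch does not attempt.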