Research on Efficient Distributed Mining Algorithm of Large Scale Graph Data Based on Multi Utility Threshold
Author: Qi Meibin
Venue: 2021 International Conference of Social Computing and Digital Economy (ICSCDE)
Publication date: 2021-08-01
DOI: https://doi.org/10.1109/ICSCDE54196.2021.00065
In graph data mining, applying a single threshold to the whole data set tends to lose rare itemsets and leads to excessive memory consumption. This paper proposes an efficient distributed mining algorithm for large-scale graph data based on multiple utility thresholds. The graph data are represented in undirected labeled form, from which the frequent itemsets are constructed; the itemsets are sorted by their minimum average utility value, and useless items are deleted. A graph data utility list is designed to store the necessary itemset information. A pruning strategy based on the multiple utility thresholds is defined, and the upper utility bound of each transaction is adjusted to reduce the search space. A distributed mining algorithm for graph data is then designed that traverses the search space while balancing the computing load. Experimental results show that, on the same data sets, the peak memory usage of the proposed algorithm is lower than that of the FHM, HAUIM-MMAU, and FP-Storm algorithms. The proposed algorithm therefore reduces memory consumption and has some practical application value.
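The idea of mining with multiple utility thresholds can be illustrated with a small sketch. This is not the paper's algorithm: it uses plain (not average) utility, enumerates candidates by brute force instead of using utility lists, and runs on a single machine. The toy transactions, the per-item thresholds in `min_util`, and the rule that an itemset's threshold is the smallest threshold among its items are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its utility there.
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 2, "e": 1},
]

# Hypothetical per-item minimum utility thresholds (one threshold per item).
min_util = {"a": 15, "b": 6, "c": 9, "d": 10, "e": 6}


def itemset_threshold(itemset):
    """Assumed rule: an itemset's threshold is the smallest of its items' thresholds."""
    return min(min_util[i] for i in itemset)


def utility(itemset):
    """Utility of the itemset summed over transactions containing all its items."""
    return sum(
        sum(t[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def twu(item):
    """Transaction-weighted utility: total utility of transactions containing the item."""
    return sum(sum(t.values()) for t in transactions if item in t)


def mine():
    # TWU is an upper bound on the utility of any itemset containing the item,
    # so an item whose TWU is below the least threshold can be pruned safely.
    least = min(min_util.values())
    items = sorted(i for i in min_util if twu(i) >= least)
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            u = utility(combo)
            if u >= itemset_threshold(combo):
                result[combo] = u
    return result


if __name__ == "__main__":
    for itemset, u in mine().items():
        print(itemset, u)
```

In this toy data, item `e` is pruned before enumeration because its transaction-weighted utility is below every threshold. The itemset `("b", "d")` is kept because it inherits the lower threshold of `b`, although `d` alone does not meet its own threshold. A single global threshold set to the highest value would have discarded it.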