{"title":"Graph- autofs:图神经网络中的自动特征选择","authors":"Siyu Xiong, Rengyang Liu, Chao Yi","doi":"10.1145/3456172.3456191","DOIUrl":null,"url":null,"abstract":"Graph embedding is an effective method to represent graph data in low-dimensional space for graph analysis. Based on the inspiration of graph convolutional network (GCN), researchers have made significant progress by learning the vector representation of graph topology and node features in this task. However, it is challenging to select the feature of nodes in the real-world data. Traditional feature selection method like one-hot encoding or description of the node brings large memory and computation cost. Even worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS can automatically select important features as training inputs, and the computational cost is precisely equal to the convergence of the training target model. We did not introduce discrete candidate feature sets in the search stage but relax the choices to be continuous by introducing the architecture parameters. By implementing a regularized optimizer on the architecture parameters, the model can automatically identify and delete redundant features during the model's training process. In the re-train phase, we keep the architecture parameters serving as an attention unit to boost the performance. We use three public benchmark data sets and two popular graph embedding methods to conduct experiments to verify the performance of Graph-AutoFS in node clustering and link prediction tasks. 
Experimental results show that Graph-AutoFS consistently outperforms original graph embedding methods considerably on these tasks.","PeriodicalId":133908,"journal":{"name":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Graph-AutoFS: Auto Feature Selection in Graph Neural\",\"authors\":\"Siyu Xiong, Rengyang Liu, Chao Yi\",\"doi\":\"10.1145/3456172.3456191\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph embedding is an effective method to represent graph data in low-dimensional space for graph analysis. Based on the inspiration of graph convolutional network (GCN), researchers have made significant progress by learning the vector representation of graph topology and node features in this task. However, it is challenging to select the feature of nodes in the real-world data. Traditional feature selection method like one-hot encoding or description of the node brings large memory and computation cost. Even worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS can automatically select important features as training inputs, and the computational cost is precisely equal to the convergence of the training target model. We did not introduce discrete candidate feature sets in the search stage but relax the choices to be continuous by introducing the architecture parameters. By implementing a regularized optimizer on the architecture parameters, the model can automatically identify and delete redundant features during the model's training process. 
In the re-train phase, we keep the architecture parameters serving as an attention unit to boost the performance. We use three public benchmark data sets and two popular graph embedding methods to conduct experiments to verify the performance of Graph-AutoFS in node clustering and link prediction tasks. Experimental results show that Graph-AutoFS consistently outperforms original graph embedding methods considerably on these tasks.\",\"PeriodicalId\":133908,\"journal\":{\"name\":\"Proceedings of the 2021 7th International Conference on Computing and Data Engineering\",\"volume\":\"26 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-01-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2021 7th International Conference on Computing and Data Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3456172.3456191\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3456172.3456191","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Graph embedding is an effective method for representing graph data in a low-dimensional space for graph analysis. Inspired by graph convolutional networks (GCNs), researchers have made significant progress on this task by learning vector representations of graph topology and node features. However, selecting node features in real-world data is challenging. Traditional feature-construction methods, such as one-hot encoding or node descriptions, incur large memory and computation costs. Worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS automatically selects important features as training inputs, and its computational cost is simply that of training the target model to convergence. In the search stage, rather than enumerating discrete candidate feature sets, we relax the selection to be continuous by introducing architecture parameters. By applying a regularized optimizer to these architecture parameters, the model automatically identifies and removes redundant features during training. In the re-train stage, we keep the architecture parameters, which serve as an attention unit, to boost performance. We conduct experiments on three public benchmark datasets with two popular graph embedding methods to verify the performance of Graph-AutoFS on node clustering and link prediction tasks. Experimental results show that Graph-AutoFS consistently and considerably outperforms the original graph embedding methods on these tasks.
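The two-stage procedure the abstract describes — relax the discrete feature choice into continuous, learnable architecture parameters, drive them to sparsity with a regularized optimizer, then re-train keeping the surviving parameters as gates — can be illustrated on a toy problem. The sketch below is our own illustration under stated assumptions, not the paper's implementation: it substitutes a plain linear model for the GCN, fits the model weights by least squares instead of jointly, and uses a proximal soft-thresholding step for the L1 regularizer; all variable names are hypothetical.

```python
import numpy as np

# Toy data: 6 candidate node features, only the first 2 are informative.
rng = np.random.default_rng(0)
n, d = 200, 6
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ w_true + 0.1 * rng.normal(size=n)

# Stand-in for the target model's weights (the paper trains a GCN jointly;
# here we simply fit a linear model by least squares for clarity).
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Search stage: one continuous architecture parameter (gate) per feature,
# trained with an L1-regularized objective so redundant gates shrink to 0.
alpha = np.ones(d)          # architecture parameters (feature gates)
lr, lam = 0.05, 0.1         # step size, L1 strength (illustrative values)
for _ in range(500):
    err = X @ (alpha * w) - y
    grad = (X.T @ err) * w / n                 # d(MSE/2)/d(alpha)
    alpha = alpha - lr * grad
    # Proximal step for the L1 penalty: soft-thresholding zeroes out
    # gates whose data gradient cannot resist the shrinkage.
    alpha = np.sign(alpha) * np.maximum(np.abs(alpha) - lr * lam, 0.0)

selected = np.nonzero(alpha > 1e-3)[0]

# Re-train stage: keep the surviving gates as a fixed attention unit and
# refit the model on the selected features only.
Xs = X[:, selected] * alpha[selected]
w2, *_ = np.linalg.lstsq(Xs, y, rcond=None)
print("selected features:", selected)
```

On this toy problem the gates on the four noise features are driven to zero by the soft-thresholding step, while the gates on the two informative features settle near 1, so the re-train stage sees only the useful inputs — the same qualitative behavior the search stage is meant to produce.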