Graph-AutoFS: Auto Feature Selection in Graph Neural Networks

Siyu Xiong, Rengyang Liu, Chao Yi
{"title":"Graph- autofs:图神经网络中的自动特征选择","authors":"Siyu Xiong, Rengyang Liu, Chao Yi","doi":"10.1145/3456172.3456191","DOIUrl":null,"url":null,"abstract":"Graph embedding is an effective method to represent graph data in low-dimensional space for graph analysis. Based on the inspiration of graph convolutional network (GCN), researchers have made significant progress by learning the vector representation of graph topology and node features in this task. However, it is challenging to select the feature of nodes in the real-world data. Traditional feature selection method like one-hot encoding or description of the node brings large memory and computation cost. Even worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS can automatically select important features as training inputs, and the computational cost is precisely equal to the convergence of the training target model. We did not introduce discrete candidate feature sets in the search stage but relax the choices to be continuous by introducing the architecture parameters. By implementing a regularized optimizer on the architecture parameters, the model can automatically identify and delete redundant features during the model's training process. In the re-train phase, we keep the architecture parameters serving as an attention unit to boost the performance. We use three public benchmark data sets and two popular graph embedding methods to conduct experiments to verify the performance of Graph-AutoFS in node clustering and link prediction tasks. Experimental results show that Graph-AutoFS consistently outperforms original graph embedding methods considerably on these tasks.","PeriodicalId":133908,"journal":{"name":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Graph-AutoFS: Auto Feature Selection in Graph Neural\",\"authors\":\"Siyu Xiong, Rengyang Liu, Chao Yi\",\"doi\":\"10.1145/3456172.3456191\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph embedding is an effective method to represent graph data in low-dimensional space for graph analysis. Based on the inspiration of graph convolutional network (GCN), researchers have made significant progress by learning the vector representation of graph topology and node features in this task. However, it is challenging to select the feature of nodes in the real-world data. Traditional feature selection method like one-hot encoding or description of the node brings large memory and computation cost. Even worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS can automatically select important features as training inputs, and the computational cost is precisely equal to the convergence of the training target model. We did not introduce discrete candidate feature sets in the search stage but relax the choices to be continuous by introducing the architecture parameters. 
By implementing a regularized optimizer on the architecture parameters, the model can automatically identify and delete redundant features during the model's training process. In the re-train phase, we keep the architecture parameters serving as an attention unit to boost the performance. We use three public benchmark data sets and two popular graph embedding methods to conduct experiments to verify the performance of Graph-AutoFS in node clustering and link prediction tasks. Experimental results show that Graph-AutoFS consistently outperforms original graph embedding methods considerably on these tasks.\",\"PeriodicalId\":133908,\"journal\":{\"name\":\"Proceedings of the 2021 7th International Conference on Computing and Data Engineering\",\"volume\":\"26 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-01-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2021 7th International Conference on Computing and Data Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3456172.3456191\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3456172.3456191","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Graph embedding is an effective method for representing graph data in a low-dimensional space for graph analysis. Inspired by graph convolutional networks (GCNs), researchers have made significant progress on this task by learning vector representations of graph topology and node features. However, selecting node features in real-world data is challenging. Traditional feature selection methods, such as one-hot encoding or node descriptions, incur large memory and computation costs. Even worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS automatically selects important features as training inputs, and its computational cost is simply that of training the target model to convergence. Rather than introducing discrete candidate feature sets in the search stage, we relax the choices to be continuous by introducing architecture parameters. By applying a regularized optimizer to the architecture parameters, the model can automatically identify and remove redundant features during training. In the re-train phase, we keep the architecture parameters, which serve as an attention unit to boost performance. We conduct experiments on three public benchmark data sets with two popular graph embedding methods to verify the performance of Graph-AutoFS on node clustering and link prediction tasks. Experimental results show that Graph-AutoFS consistently and considerably outperforms the original graph embedding methods on these tasks.
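The search stage described in the abstract resembles differentiable architecture search applied to feature selection: each candidate feature gets a continuous architecture parameter (a soft gate), a regularizer drives the gates of redundant features toward zero while the target model trains as usual, and the surviving gates are kept as attention-style weights in the re-train phase. The sketch below is a minimal illustration of that general idea under these assumptions; it is not the authors' released code, and the names FeatureGate, l1_weight, and prune_threshold are introduced here purely for illustration.

```python
# Minimal sketch (not the authors' implementation) of continuous feature selection
# via per-feature architecture parameters with an L1-style regularizer.
import torch
import torch.nn as nn


class FeatureGate(nn.Module):
    """Continuous relaxation of per-feature selection via architecture parameters."""

    def __init__(self, num_features: int):
        super().__init__()
        # One learnable architecture parameter per input feature, initialized to 1.
        self.alpha = nn.Parameter(torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale every feature column by its gate; gates near zero suppress the feature.
        return x * self.alpha

    def regularizer(self) -> torch.Tensor:
        # L1 penalty encourages sparse gates, i.e. automatic removal of redundant features.
        return self.alpha.abs().sum()

    def selected_features(self, prune_threshold: float = 1e-2) -> torch.Tensor:
        # Indices of features whose gates survived the search stage.
        return (self.alpha.abs() > prune_threshold).nonzero(as_tuple=True)[0]


def search_stage(model: nn.Module, gate: FeatureGate, loader, l1_weight: float = 1e-3):
    # Joint training of the target model and the gates; model and loader are placeholders.
    optimizer = torch.optim.Adam(list(model.parameters()) + list(gate.parameters()), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for features, labels in loader:
        optimizer.zero_grad()
        logits = model(gate(features))  # gated features feed the embedding / GNN model
        loss = loss_fn(logits, labels) + l1_weight * gate.regularizer()
        loss.backward()
        optimizer.step()
    return gate.selected_features()
```

In the re-train phase described in the abstract, one would drop the pruned features and keep the surviving gate values attached to the selected features, where they act as attention-like weights rather than hard selection decisions.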