{"title":"Graph- autofs:图神经网络中的自动特征选择","authors":"Siyu Xiong, Rengyang Liu, Chao Yi","doi":"10.1145/3456172.3456191","DOIUrl":null,"url":null,"abstract":"Graph embedding is an effective method to represent graph data in low-dimensional space for graph analysis. Based on the inspiration of graph convolutional network (GCN), researchers have made significant progress by learning the vector representation of graph topology and node features in this task. However, it is challenging to select the feature of nodes in the real-world data. Traditional feature selection method like one-hot encoding or description of the node brings large memory and computation cost. Even worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS can automatically select important features as training inputs, and the computational cost is precisely equal to the convergence of the training target model. We did not introduce discrete candidate feature sets in the search stage but relax the choices to be continuous by introducing the architecture parameters. By implementing a regularized optimizer on the architecture parameters, the model can automatically identify and delete redundant features during the model's training process. In the re-train phase, we keep the architecture parameters serving as an attention unit to boost the performance. We use three public benchmark data sets and two popular graph embedding methods to conduct experiments to verify the performance of Graph-AutoFS in node clustering and link prediction tasks. 
Experimental results show that Graph-AutoFS consistently outperforms original graph embedding methods considerably on these tasks.","PeriodicalId":133908,"journal":{"name":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Graph-AutoFS: Auto Feature Selection in Graph Neural\",\"authors\":\"Siyu Xiong, Rengyang Liu, Chao Yi\",\"doi\":\"10.1145/3456172.3456191\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph embedding is an effective method to represent graph data in low-dimensional space for graph analysis. Based on the inspiration of graph convolutional network (GCN), researchers have made significant progress by learning the vector representation of graph topology and node features in this task. However, it is challenging to select the feature of nodes in the real-world data. Traditional feature selection method like one-hot encoding or description of the node brings large memory and computation cost. Even worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS can automatically select important features as training inputs, and the computational cost is precisely equal to the convergence of the training target model. We did not introduce discrete candidate feature sets in the search stage but relax the choices to be continuous by introducing the architecture parameters. By implementing a regularized optimizer on the architecture parameters, the model can automatically identify and delete redundant features during the model's training process. 
In the re-train phase, we keep the architecture parameters serving as an attention unit to boost the performance. We use three public benchmark data sets and two popular graph embedding methods to conduct experiments to verify the performance of Graph-AutoFS in node clustering and link prediction tasks. Experimental results show that Graph-AutoFS consistently outperforms original graph embedding methods considerably on these tasks.\",\"PeriodicalId\":133908,\"journal\":{\"name\":\"Proceedings of the 2021 7th International Conference on Computing and Data Engineering\",\"volume\":\"26 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-01-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2021 7th International Conference on Computing and Data Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3456172.3456191\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3456172.3456191","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Graph embedding is an effective method for representing graph data in a low-dimensional space for graph analysis. Inspired by graph convolutional networks (GCNs), researchers have made significant progress on this task by learning vector representations of graph topology and node features. However, selecting node features in real-world data is challenging. Traditional feature-construction methods, such as one-hot encoding or node descriptions, incur large memory and computation costs. Worse, useless features may introduce noise and complicate the training process. In this paper, we propose a two-stage algorithm for graph automatic feature selection (Graph-AutoFS) to improve existing models. Graph-AutoFS automatically selects important features as training inputs, and its computational cost is simply that of training the target model to convergence. In the search stage, rather than enumerating discrete candidate feature sets, we relax the selection to be continuous by introducing architecture parameters. By applying a regularized optimizer to these architecture parameters, the model automatically identifies and removes redundant features during training. In the re-train stage, we keep the architecture parameters, which serve as an attention unit, to boost performance. We conduct experiments on three public benchmark datasets with two popular graph embedding methods to verify the performance of Graph-AutoFS on node clustering and link prediction tasks. Experimental results show that Graph-AutoFS consistently and considerably outperforms the original graph embedding methods on these tasks.
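The two-stage procedure the abstract describes — relax the discrete feature choice into continuous, learnable architecture parameters, drive them to sparsity with a regularized optimizer, then re-train keeping the surviving parameters as gates — can be illustrated on a toy problem. The sketch below is our own illustration under stated assumptions, not the paper's implementation: it substitutes a plain linear model for the GCN, fits the model weights by least squares instead of jointly, and uses a proximal soft-thresholding step for the L1 regularizer; all variable names are hypothetical.

```python
import numpy as np

# Toy data: 6 candidate node features, only the first 2 are informative.
rng = np.random.default_rng(0)
n, d = 200, 6
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ w_true + 0.1 * rng.normal(size=n)

# Stand-in for the target model's weights (the paper trains a GCN jointly;
# here we simply fit a linear model by least squares for clarity).
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Search stage: one continuous architecture parameter (gate) per feature,
# trained with an L1-regularized objective so redundant gates shrink to 0.
alpha = np.ones(d)          # architecture parameters (feature gates)
lr, lam = 0.05, 0.1         # step size, L1 strength (illustrative values)
for _ in range(500):
    err = X @ (alpha * w) - y
    grad = (X.T @ err) * w / n                 # d(MSE/2)/d(alpha)
    alpha = alpha - lr * grad
    # Proximal step for the L1 penalty: soft-thresholding zeroes out
    # gates whose data gradient cannot resist the shrinkage.
    alpha = np.sign(alpha) * np.maximum(np.abs(alpha) - lr * lam, 0.0)

selected = np.nonzero(alpha > 1e-3)[0]

# Re-train stage: keep the surviving gates as a fixed attention unit and
# refit the model on the selected features only.
Xs = X[:, selected] * alpha[selected]
w2, *_ = np.linalg.lstsq(Xs, y, rcond=None)
print("selected features:", selected)
```

On this toy problem the gates on the four noise features are driven to zero by the soft-thresholding step, while the gates on the two informative features settle near 1, so the re-train stage sees only the useful inputs — the same qualitative behavior the search stage is meant to produce.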