Scalable Missing Data Imputation With Graph Neural Networks
Guillaume Lachaud, Patricia Conde Céspedes, M. Trocan
2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), 4 June 2023. DOI: 10.1109/ICASSPW59220.2023.10193535
Missing features in tabular and graph-structured data are common: a company may not want to disclose all of its accounting records, and online users do not always engage with social platforms in the same way as their peers. Recently, models such as the GRAPE architecture have achieved state-of-the-art results on the task of feature imputation. We present an extension of GRAPE that performs mini-batch learning on datasets that do not fit in GPU memory. Moreover, we add preprocessing and post-processing steps that allow the model to be used with graph-structured data. We experimentally show the behaviour of the model on an academic citation network under different regimes of missing data. We observe that model performance starts to decrease when fewer than 1% of the edges are observed. We additionally perform an ablation study of key elements of the model, such as its capacity, the batch size and the number of layers.
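To make the setting concrete, below is a minimal, hypothetical sketch of the GRAPE-style formulation the abstract builds on: the table is viewed as a bipartite graph between sample nodes and feature nodes, each observed entry becomes an edge carrying its value, and imputation is cast as edge-value prediction, with only a mini-batch of edges moved to the GPU at each step. All class and function names (BipartiteImputer, train_imputer, etc.) are illustrative; this is not the authors' implementation, and the model in the paper differs in its layers, preprocessing and post-processing.

```python
# Hypothetical sketch (not the authors' code): GRAPE-style bipartite imputation
# with mini-batch edge sampling, in plain PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BipartiteImputer(nn.Module):
    """Sample nodes and feature nodes exchange messages along observed entries;
    an edge MLP then predicts the value of any queried (sample, feature) pair."""

    def __init__(self, n_samples, n_features, dim=32):
        super().__init__()
        self.sample_emb = nn.Embedding(n_samples, dim)
        self.feature_emb = nn.Embedding(n_features, dim)
        self.msg = nn.Linear(dim + 1, dim)  # neighbour embedding + observed value
        self.upd = nn.Linear(2 * dim, dim)
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, obs_rows, obs_cols, obs_vals, query_rows, query_cols):
        h_s, h_f = self.sample_emb.weight, self.feature_emb.weight
        # One round of mean aggregation over the observed edges (feature -> sample).
        msgs = torch.relu(self.msg(torch.cat([h_f[obs_cols], obs_vals[:, None]], dim=1)))
        agg = torch.zeros_like(h_s)
        agg.index_add_(0, obs_rows, msgs)
        deg = torch.zeros(h_s.size(0), device=h_s.device)
        deg.index_add_(0, obs_rows, torch.ones_like(obs_vals))
        h_s = torch.relu(self.upd(torch.cat([h_s, agg / deg.clamp(min=1.0)[:, None]], dim=1)))
        # Predict a value for every queried (sample, feature) edge.
        return self.edge_mlp(torch.cat([h_s[query_rows], h_f[query_cols]], dim=1)).squeeze(-1)


def train_imputer(X, mask, epochs=50, batch_size=1024, device="cpu"):
    """X: (n_samples, n_features) float tensor; mask: True where the entry is observed.
    Each step samples a mini-batch of observed edges, uses half of them for message
    passing and predicts the held-out half, so only small edge lists reach the GPU."""
    rows, cols = mask.nonzero(as_tuple=True)
    vals = X[rows, cols].float()
    model = BipartiteImputer(X.size(0), X.size(1)).to(device)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        idx = torch.randperm(rows.numel())[:batch_size]
        half = idx.numel() // 2
        msg_idx, tgt_idx = idx[half:], idx[:half]
        pred = model(
            rows[msg_idx].to(device), cols[msg_idx].to(device), vals[msg_idx].to(device),
            rows[tgt_idx].to(device), cols[tgt_idx].to(device),
        )
        loss = F.mse_loss(pred, vals[tgt_idx].to(device))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

Sampling observed edges rather than loading the whole bipartite graph keeps per-step memory proportional to the batch size, which is the property that lets an extension of this kind handle datasets larger than GPU memory; the paper's preprocessing and post-processing for graph-structured data, and its ablation over capacity, batch size and depth, sit on top of such a loop.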