Scalable Missing Data Imputation With Graph Neural Networks
Guillaume Lachaud, Patricia Conde Céspedes, M. Trocan
2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), 4 June 2023. DOI: 10.1109/ICASSPW59220.2023.10193535
Missing features in tabular and graph-structured data are common: a company may not want to disclose all of its accounting records, and online users do not always engage with social platforms in the same way as their peers. Recently, models such as the GRAPE architecture have achieved state-of-the-art results on the task of feature imputation. We present an extension of GRAPE that performs mini-batch learning on datasets that do not fit in GPU memory. Moreover, we add preprocessing and post-processing steps that allow the model to be used with graph-structured data. We experimentally show the behaviour of the model on an academic citation network under different regimes of missing data. We observe that model performance starts to decrease when fewer than 1% of the edges are observed. We additionally perform an ablation study of key elements of the model, such as its capacity, the batch size and the number of layers.
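To make the setting concrete, below is a minimal, hypothetical sketch of the GRAPE-style formulation the abstract builds on: the table is viewed as a bipartite graph between sample nodes and feature nodes, each observed entry becomes an edge carrying its value, and imputation is cast as edge-value prediction, with only a mini-batch of edges moved to the GPU at each step. All class and function names (BipartiteImputer, train_imputer, etc.) are illustrative; this is not the authors' implementation, and the model in the paper differs in its layers, preprocessing and post-processing.

```python
# Hypothetical sketch (not the authors' code): GRAPE-style bipartite imputation
# with mini-batch edge sampling, in plain PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BipartiteImputer(nn.Module):
    """Sample nodes and feature nodes exchange messages along observed entries;
    an edge MLP then predicts the value of any queried (sample, feature) pair."""

    def __init__(self, n_samples, n_features, dim=32):
        super().__init__()
        self.sample_emb = nn.Embedding(n_samples, dim)
        self.feature_emb = nn.Embedding(n_features, dim)
        self.msg = nn.Linear(dim + 1, dim)  # neighbour embedding + observed value
        self.upd = nn.Linear(2 * dim, dim)
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, obs_rows, obs_cols, obs_vals, query_rows, query_cols):
        h_s, h_f = self.sample_emb.weight, self.feature_emb.weight
        # One round of mean aggregation over the observed edges (feature -> sample).
        msgs = torch.relu(self.msg(torch.cat([h_f[obs_cols], obs_vals[:, None]], dim=1)))
        agg = torch.zeros_like(h_s)
        agg.index_add_(0, obs_rows, msgs)
        deg = torch.zeros(h_s.size(0), device=h_s.device)
        deg.index_add_(0, obs_rows, torch.ones_like(obs_vals))
        h_s = torch.relu(self.upd(torch.cat([h_s, agg / deg.clamp(min=1.0)[:, None]], dim=1)))
        # Predict a value for every queried (sample, feature) edge.
        return self.edge_mlp(torch.cat([h_s[query_rows], h_f[query_cols]], dim=1)).squeeze(-1)


def train_imputer(X, mask, epochs=50, batch_size=1024, device="cpu"):
    """X: (n_samples, n_features) float tensor; mask: True where the entry is observed.
    Each step samples a mini-batch of observed edges, uses half of them for message
    passing and predicts the held-out half, so only small edge lists reach the GPU."""
    rows, cols = mask.nonzero(as_tuple=True)
    vals = X[rows, cols].float()
    model = BipartiteImputer(X.size(0), X.size(1)).to(device)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        idx = torch.randperm(rows.numel())[:batch_size]
        half = idx.numel() // 2
        msg_idx, tgt_idx = idx[half:], idx[:half]
        pred = model(
            rows[msg_idx].to(device), cols[msg_idx].to(device), vals[msg_idx].to(device),
            rows[tgt_idx].to(device), cols[tgt_idx].to(device),
        )
        loss = F.mse_loss(pred, vals[tgt_idx].to(device))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

Sampling observed edges rather than loading the whole bipartite graph keeps per-step memory proportional to the batch size, which is the property that lets an extension of this kind handle datasets larger than GPU memory; the paper's preprocessing and post-processing for graph-structured data, and its ablation over capacity, batch size and depth, sit on top of such a loop.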