Rethinking Graph Classification Problem in Presence of Isomorphism
Authors: S. Ivanov, S. Sviridov, E. Burnaev
Journal: Doklady Mathematics, volume 110 (1 supplement), pages S312 - S331
Publication date: 2025-03-22
DOI: 10.1134/S1064562424602385
Article page: https://link.springer.com/article/10.1134/S1064562424602385
PDF: https://link.springer.com/content/pdf/10.1134/S1064562424602385.pdf
Citations: 0
Abstract
There is increasing interest in developing new models for the graph classification problem, which serves as a common benchmark for evaluating and comparing GNNs and graph kernels. To ensure a fair comparison of the models, several commonly used datasets exist, and current assessments and conclusions rely on the validity of these datasets. However, as we show in this paper, the majority of these datasets contain isomorphic copies of the data points, which can lead to misleading conclusions. For example, the relative ranking of the graph models can change substantially if we remove isomorphic graphs from the test set.
To mitigate this, we present several results. We show that explicitly incorporating the knowledge of isomorphism in the datasets can significantly boost the performance of any graph model. Finally, we re-evaluate commonly used graph models on refined graph datasets and provide recommendations for designing new datasets and metrics for the graph classification problem.
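The kind of check the abstract refers to, finding test graphs that are isomorphic copies of training graphs, can be illustrated with a minimal sketch. This is not the authors' procedure. It assumes unlabeled, undirected graphs given as a node count plus an edge list, and the function names `canonical_form` and `split_test_by_isomorphism` are hypothetical. The brute-force canonical form tries every node relabeling, so it is only feasible for very small graphs; real datasets would need a dedicated isomorphism tool.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Return a relabeling-invariant key for an undirected graph.

    Brute force: take the smallest sorted edge tuple over all
    permutations of the nodes 0..n-1. Only usable for tiny graphs.
    """
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(
            sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges)
        )
        if best is None or relabeled < best:
            best = relabeled
    return (n, best)


def split_test_by_isomorphism(train, test):
    """Split test indices into those isomorphic to a training graph and the rest."""
    train_forms = {canonical_form(n, e) for n, e in train}
    seen, unseen = [], []
    for i, (n, e) in enumerate(test):
        if canonical_form(n, e) in train_forms:
            seen.append(i)
        else:
            unseen.append(i)
    return seen, unseen


# Toy data (made up for illustration): each graph is (node_count, edge_list).
train = [
    (3, [(0, 1), (1, 2)]),          # path on 3 nodes
    (3, [(0, 1), (1, 2), (0, 2)]),  # triangle
]
test = [
    (3, [(0, 2), (2, 1)]),          # the same path with nodes relabeled
    (4, [(0, 1), (1, 2), (2, 3)]),  # path on 4 nodes, not in train
]

seen, unseen = split_test_by_isomorphism(train, test)
print(seen, unseen)  # [0] [1]
```

Reporting accuracy separately on the `seen` and `unseen` indices is one simple way to see how much a model's score depends on test graphs that already appear, up to relabeling, in the training set.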
About the journal:
Doklady Mathematics is a journal of the Presidium of the Russian Academy of Sciences. It contains English translations of papers published in Doklady Akademii Nauk (Proceedings of the Russian Academy of Sciences), which was founded in 1933 and is published 36 times a year. Doklady Mathematics covers the following areas: mathematics, mathematical physics, computer science, control theory, and computers. It publishes brief scientific reports on previously unpublished significant new research in mathematics and its applications. The main contributors to the journal are Members of the RAS, Corresponding Members of the RAS, and scientists from the former Soviet Union and other foreign countries. Among the contributors are outstanding Russian mathematicians.