The Role of Graph Topology in the Performance of Biomedical Knowledge Graph Completion Models
Alberto Cattaneo, Stephen Bonner, Thomas Martynec, Carlo Luschi, Ian P Barrett, Daniel Justus
arXiv - QuanBio - Quantitative Methods, published 2024-09-06, DOI: arxiv-2409.04103
Abstract
Knowledge Graph Completion has been increasingly adopted as a useful method
for several tasks in biomedical research, such as drug repurposing or drug-target
identification. To that end, a variety of datasets and Knowledge Graph
Embedding models have been proposed over the years. However, little is known
about the properties that render a dataset useful for a given task and, even
though theoretical properties of Knowledge Graph Embedding models are well
understood, their practical utility in this field remains controversial. We
conduct a comprehensive investigation into the topological properties of
publicly available biomedical Knowledge Graphs and establish links to the
accuracy observed in real-world applications. By releasing all model
predictions and a new suite of analysis tools, we invite the community to build
upon our work and continue improving the understanding of these crucial
applications.