Dima Kagan, Juman Jubran, Esti Yeger-Lotem, Michael Fire
GigaScience, volume 14. Published 2025-01-06. DOI: https://doi.org/10.1093/gigascience/giaf034. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11976396/pdf/
Network-based anomaly detection algorithm reveals proteins with major roles in human tissues.
Background: Proteins act through physical interactions with other molecules to maintain organismal health. Protein-protein interaction (PPI) networks have proved to be a powerful framework for obtaining insight into protein functions, cellular organization, response to signals, and disease states. In multicellular organisms, protein content varies between tissues, influencing tissue morphology and function. Weighted PPI networks, reflecting the likelihood of interactions in specific tissues, offer insights into tissue-specific processes and disease mechanisms. We hypothesized that detecting anomalous nodes in these networks could reveal proteins with key tissue-specific functions.
Results: Here, we introduce Weighted Graph Anomalous Node Detection (WGAND), a novel machine-learning algorithm to identify anomalous nodes in weighted graphs. WGAND estimates expected edge weights and uses deviations to generate anomaly detection features, which are then used to score network nodes. We applied WGAND to weighted PPI networks of 17 human tissues. High-ranking anomalous nodes were enriched for proteins associated with tissue-specific diseases and tissue-specific biological processes, such as neuron signaling in the brain and spermatogenesis in the testis. WGAND outperformed other methods in terms of area under the ROC curve and precision at K, highlighting its effectiveness in uncovering biologically meaningful anomalies.
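The scoring idea described in the Results can be illustrated with a minimal sketch. This is not the authors' implementation or the API of the wgand library. The abstract does not specify how expected edge weights are estimated, so the sketch assumes a simple leave-one-out neighborhood mean as a stand-in estimator, and it uses a single node feature (the mean absolute deviation of a node's edges). The function names `node_anomaly_scores` and `precision_at_k` and the toy graph are hypothetical.

```python
from collections import defaultdict


def node_anomaly_scores(edges):
    """Score nodes by how far their edge weights deviate from an expected value.

    edges: dict mapping (u, v) -> weight for an undirected weighted graph.
    """
    total = defaultdict(float)  # sum of incident edge weights per node
    degree = defaultdict(int)   # number of incident edges per node
    for (u, v), w in edges.items():
        for node in (u, v):
            total[node] += w
            degree[node] += 1

    deviations = defaultdict(list)
    for (u, v), w in edges.items():
        # Stand-in estimator (an assumption, not the published method):
        # leave-one-out mean of the other edges touching either endpoint.
        others = (degree[u] - 1) + (degree[v] - 1)
        expected = ((total[u] - w) + (total[v] - w)) / others if others else w
        for node in (u, v):
            deviations[node].append(abs(w - expected))

    # One feature per node: the mean absolute deviation of its edges.
    return {node: sum(d) / len(d) for node, d in deviations.items()}


def precision_at_k(ranked_nodes, positives, k):
    """Fraction of the top-k ranked nodes that belong to the positive set."""
    return len(set(ranked_nodes[:k]) & set(positives)) / k


# Toy graph: A-D form a uniform clique; X has one unusually strong edge
# and one unusually weak edge.
toy_edges = {
    ("A", "B"): 0.5, ("A", "C"): 0.5, ("A", "D"): 0.5,
    ("B", "C"): 0.5, ("B", "D"): 0.5, ("C", "D"): 0.5,
    ("A", "X"): 0.9, ("B", "X"): 0.1,
}
scores = node_anomaly_scores(toy_edges)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0], precision_at_k(ranking, {"X"}, 1))
```

In this toy graph, node X ranks first because both of its edges deviate strongly from the surrounding weights. A real analysis would replace the stand-in estimator with a learned model and combine several deviation features per node.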
Conclusions: Our findings demonstrate WGAND's potential as a powerful tool for detecting anomalous proteins with significant biological roles. By identifying proteins involved in critical tissue-specific processes and diseases, WGAND offers valuable insights for discovering novel biomarkers and therapeutic targets. Its versatile algorithm is suitable for any weighted graph and is broadly applicable across various fields. The WGAND algorithm is available as an open-source Python library at https://github.com/data4goodlab/wgand.
About the journal:
GigaScience seeks to transform data dissemination and utilization in the life and biomedical sciences. As an online open-access open-data journal, it specializes in publishing "big-data" studies encompassing various fields. Its scope includes not only "omic" type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale shareable data.