Julián Candia*, Giovanna Fantoni, Francheska Delgado-Peraza, Nader Shehadeh, Toshiko Tanaka, Ruin Moaddel, Keenan A. Walker and Luigi Ferrucci
Journal of Proteome Research, 2025, 24 (9), 4391–4402. Published 2025-08-11. DOI: 10.1021/acs.jproteome.4c01114. https://pubs.acs.org/doi/10.1021/acs.jproteome.4c01114
SomaModules: A Pathway Enrichment Approach Tailored to SomaScan Data
Motivated by the lack of adequate tools to perform pathway enrichment analysis, this work presents an approach specifically tailored to SomaScan data. Starting from annotated gene sets, we developed a greedy, top-down procedure to iteratively identify strongly intracorrelated SOMAmer modules, termed “SomaModules”, based on 11K SomaScan data. We generated two repositories based on the latest MSigDB and MitoCarta releases, containing more than 40,000 SOMAmer-based gene sets combined. These repositories can be utilized by any unstructured pathway enrichment analysis tool. We validated our results with two case examples: (i) Alzheimer’s disease-specific pathways in a 7K SomaScan case–control study, and (ii) mitochondrial pathways using 11K SomaScan data linked to physical performance outcomes. Using gene set enrichment analysis (GSEA), we found that, in both examples, SomaModules had significantly higher enrichment than their original gene set counterparts. These findings were robust and not significantly affected by the choice of enrichment metric or the Kolmogorov enrichment statistic used in the GSEA procedure. We provide users with access to all code, documentation, and data needed to reproduce our current repositories, which will also enable them to leverage our framework to analyze SomaModules derived from other sources, including custom, user-generated gene sets.
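The greedy, top-down idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The pruning rule (drop the member with the lowest mean correlation to the rest), the use of Pearson correlation, the `target` and `min_size` parameters, and the function name `greedy_module` are all assumptions made for illustration. The sketch extracts a single module from synthetic data rather than iterating over a full gene set repository.

```python
import numpy as np


def greedy_module(corr, members, target=0.6, min_size=3):
    """Hypothetical top-down pruning of one gene set.

    Starting from all `members` (column indices into `corr`), repeatedly drop
    the member with the lowest mean correlation to the others until the mean
    pairwise correlation reaches `target` or only `min_size` members remain.
    """
    members = list(members)
    if len(members) < 2:
        raise ValueError("need at least two members to compute correlations")
    min_size = max(min_size, 2)  # keep at least one pair
    while True:
        sub = corr[np.ix_(members, members)]
        n = len(members)
        # Mean correlation of each member with the rest (diagonal excluded).
        mean_with_others = (sub.sum(axis=1) - np.diag(sub)) / (n - 1)
        # Averaging these values gives the mean off-diagonal correlation.
        mean_pairwise = float(mean_with_others.mean())
        if mean_pairwise >= target or n <= min_size:
            return members, mean_pairwise
        # Greedy step: remove the weakest member and re-evaluate.
        members.pop(int(np.argmin(mean_with_others)))


# Synthetic example (assumed data, not SomaScan measurements):
# columns 0-3 share a latent signal, columns 4-5 are independent noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
correlated = latent + 0.5 * rng.normal(size=(200, 4))
noise = rng.normal(size=(200, 2))
X = np.hstack([correlated, noise])

corr = np.corrcoef(X, rowvar=False)  # Pearson correlation between columns
module, score = greedy_module(corr, range(X.shape[1]))
print(sorted(module), round(score, 2))
```

In this toy setting the two noise columns are pruned first, leaving the four columns that share the latent signal. A real workflow would replace `X` with SOMAmer measurements and `members` with the SOMAmers mapped to one annotated gene set.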
About the journal:
Journal of Proteome Research publishes content encompassing all aspects of global protein analysis and function, including the dynamic aspects of genomics, spatio-temporal proteomics, metabonomics and metabolomics, clinical and agricultural proteomics, as well as advances in methodology including bioinformatics. The theme and emphasis is on a multidisciplinary approach to the life sciences through the synergy between the different types of "omics".