GeneSetCart: assembling, augmenting, combining, visualizing, and analyzing gene sets

Impact factor: 11.8 | Zone 2 (Biology) | JCR Q1, Multidisciplinary Sciences
Giacomo B Marino, Stephanie Olaiya, John Erol Evangelista, Daniel J B Clarke, Avi Ma'ayan
GigaScience, volume 14. Published 2025-01-06. Journal article.
DOI: 10.1093/gigascience/giaf025 (https://doi.org/10.1093/gigascience/giaf025)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11984350/pdf/
Citations: 0

Abstract


Converting multiomics datasets into gene sets facilitates data integration that leads to knowledge discovery. Although there are tools developed to analyze gene sets, only a few offer the management of gene sets from multiple sources. GeneSetCart is an interactive web-based platform that enables investigators to gather gene sets from various sources; augment these sets with gene-gene coexpression correlations and protein-protein interactions; perform set operations on these sets such as union, consensus, and intersection; and visualize and analyze these gene sets, all in one place. GeneSetCart supports the upload of single or multiple gene sets, as well as fetching gene sets by searching PubMed for genes comentioned with terms in publications. Venn diagrams, heatmaps, Uniform Manifold Approximation and Projection (UMAP) plots, SuperVenn diagrams, and UpSet plots can visualize the gene sets in a GeneSetCart session to summarize the similarity and overlap among the sets. Users of GeneSetCart can also perform enrichment analysis on their assembled gene sets with external tools. All gene sets in a session can be saved to a user account for reanalysis and sharing with collaborators. GeneSetCart has a gene set library crossing feature that enables analysis of gene sets created from several National Institutes of Health Common Fund programs. For the top overlapping sets from pairs of programs, a large language model (LLM) is prompted to propose possible reasons for the high overlap. Using this feature, two use cases are presented. In addition, users of GeneSetCart can produce publication-ready reports from their uploaded sets. Text in these reports is also supplemented with an LLM. Overall, GeneSetCart is a useful resource enabling biologists without programming expertise to facilitate data integration for hypothesis generation.
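The set operations named in the abstract (union, consensus, and intersection) can be illustrated with a minimal Python sketch. This is not GeneSetCart's code: the function names, the example gene sets, and the definition of consensus used here (genes that appear in at least `min_sets` of the sets) are assumptions made for illustration. The Jaccard index is included only as one common way to quantify the overlap between two sets.

```python
from collections import Counter


def union(gene_sets):
    """Genes present in at least one of the sets."""
    return set().union(*(set(s) for s in gene_sets.values()))


def intersection(gene_sets):
    """Genes present in every set (empty if no sets are given)."""
    sets = [set(s) for s in gene_sets.values()]
    return set.intersection(*sets) if sets else set()


def consensus(gene_sets, min_sets=2):
    """Genes present in at least `min_sets` sets (assumed definition)."""
    counts = Counter(g for s in gene_sets.values() for g in set(s))
    return {g for g, n in counts.items() if n >= min_sets}


def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


# Hypothetical example gene sets, keyed by an arbitrary label.
gene_sets = {
    "set_a": {"TP53", "BRCA1", "MYC"},
    "set_b": {"TP53", "MYC", "EGFR"},
    "set_c": {"TP53", "KRAS"},
}

print(sorted(union(gene_sets)))         # ['BRCA1', 'EGFR', 'KRAS', 'MYC', 'TP53']
print(sorted(intersection(gene_sets)))  # ['TP53']
print(sorted(consensus(gene_sets, 2)))  # ['MYC', 'TP53']
print(jaccard(gene_sets["set_a"], gene_sets["set_b"]))  # 0.5
```

With `min_sets=1` the consensus equals the union, and with `min_sets` equal to the number of sets it equals the intersection, so under this assumed definition the threshold interpolates between the two.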

Source journal
GigaScience (Multidisciplinary Sciences)
CiteScore: 15.50
Self-citation rate: 1.10%
Articles published: 119
Review time: 1 week
About the journal: GigaScience seeks to transform data dissemination and utilization in the life and biomedical sciences. As an online open-access open-data journal, it specializes in publishing "big-data" studies encompassing various fields. Its scope includes not only "omic" type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale shareable data.