Lucas Bickmann, Sarah Sandmann, Carolin Walter, Julian Varghese
{"title":"PubMed基因/细胞类型关系图谱。","authors":"Lucas Bickmann, Sarah Sandmann, Carolin Walter, Julian Varghese","doi":"10.1186/s12859-025-06236-8","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Rapid extraction and visualization of cell-specific gene expression is important for automatic cell type annotation, e.g. in single cell analysis. There is an emerging field in which tools such as curated databases or machine learning methods are used to support cell type annotation. However, complementing approaches to efficiently incorporate the latest knowledge of free-text articles from literature databases, such as PubMed, are understudied.</p><p><strong>Results: </strong>This work introduces the PubMed Gene/Cell type-Relation Atlas (PuMA) which provides a local, easy-to-use web-interface to facilitate literature-driven cell type annotation. It utilizes a pretrained machine learning based named entity recognition model in order to extract gene and cell type concepts from PubMed, links biomedical ontologies, and suggests gene to cell type relations based on a ranking score. It includes a search tool for genes and cell types, additionally providing an interactive graph visualization for exploring cross-relations. Each result is fully traceable by linking the relevant PubMed articles.</p><p><strong>Conclusions: </strong>This work enables researchers to analyse and automatize cell type annotation based on PubMed articles. It complements manual curated marker gene databases and enables interactive visualizations. The evaluation shows that PuMA is competitive against an extensive manual curated database across three gold standard datasets and two species-mouse and human. The software framework is freely available and enables regular article imports for incremental knowledge updates.GitLab: https://imigitlab.uni-muenster.de/published/PuMA/.</p>","PeriodicalId":8958,"journal":{"name":"BMC Bioinformatics","volume":"26 1","pages":"201"},"PeriodicalIF":3.3000,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12308971/pdf/","citationCount":"0","resultStr":"{\"title\":\"PuMA: PubMed gene/cell type-relation Atlas.\",\"authors\":\"Lucas Bickmann, Sarah Sandmann, Carolin Walter, Julian Varghese\",\"doi\":\"10.1186/s12859-025-06236-8\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Rapid extraction and visualization of cell-specific gene expression is important for automatic cell type annotation, e.g. in single cell analysis. There is an emerging field in which tools such as curated databases or machine learning methods are used to support cell type annotation. However, complementing approaches to efficiently incorporate the latest knowledge of free-text articles from literature databases, such as PubMed, are understudied.</p><p><strong>Results: </strong>This work introduces the PubMed Gene/Cell type-Relation Atlas (PuMA) which provides a local, easy-to-use web-interface to facilitate literature-driven cell type annotation. It utilizes a pretrained machine learning based named entity recognition model in order to extract gene and cell type concepts from PubMed, links biomedical ontologies, and suggests gene to cell type relations based on a ranking score. It includes a search tool for genes and cell types, additionally providing an interactive graph visualization for exploring cross-relations. Each result is fully traceable by linking the relevant PubMed articles.</p><p><strong>Conclusions: </strong>This work enables researchers to analyse and automatize cell type annotation based on PubMed articles. It complements manual curated marker gene databases and enables interactive visualizations. The evaluation shows that PuMA is competitive against an extensive manual curated database across three gold standard datasets and two species-mouse and human. The software framework is freely available and enables regular article imports for incremental knowledge updates.GitLab: https://imigitlab.uni-muenster.de/published/PuMA/.</p>\",\"PeriodicalId\":8958,\"journal\":{\"name\":\"BMC Bioinformatics\",\"volume\":\"26 1\",\"pages\":\"201\"},\"PeriodicalIF\":3.3000,\"publicationDate\":\"2025-07-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12308971/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BMC Bioinformatics\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1186/s12859-025-06236-8\",\"RegionNum\":3,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s12859-025-06236-8","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
Background: Rapid extraction and visualization of cell-specific gene expression is important for automatic cell type annotation, e.g. in single cell analysis. There is an emerging field in which tools such as curated databases or machine learning methods are used to support cell type annotation. However, complementing approaches to efficiently incorporate the latest knowledge of free-text articles from literature databases, such as PubMed, are understudied.
Results: This work introduces the PubMed Gene/Cell type-Relation Atlas (PuMA) which provides a local, easy-to-use web-interface to facilitate literature-driven cell type annotation. It utilizes a pretrained machine learning based named entity recognition model in order to extract gene and cell type concepts from PubMed, links biomedical ontologies, and suggests gene to cell type relations based on a ranking score. It includes a search tool for genes and cell types, additionally providing an interactive graph visualization for exploring cross-relations. Each result is fully traceable by linking the relevant PubMed articles.
Conclusions: This work enables researchers to analyse and automatize cell type annotation based on PubMed articles. It complements manual curated marker gene databases and enables interactive visualizations. The evaluation shows that PuMA is competitive against an extensive manual curated database across three gold standard datasets and two species-mouse and human. The software framework is freely available and enables regular article imports for incremental knowledge updates.GitLab: https://imigitlab.uni-muenster.de/published/PuMA/.
期刊介绍:
BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology.
BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.