Briefings in bioinformatics: latest articles

A comprehensive benchmark study of methods for identifying significantly perturbed subnetworks in cancer.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae692
Le Yang, Runpu Chen, Steve Goodison, Yijun Sun
Abstract: Network-based methods utilize protein-protein interaction information to identify significantly perturbed subnetworks in cancer and to propose key molecular pathways. Numerous methods have been developed, but to date, a rigorous benchmark analysis comparing the performance of existing approaches is lacking. In this paper, we proposed a novel benchmarking framework using synthetic data and conducted a comprehensive analysis to investigate the ability of existing methods to detect target genes and subnetworks and to control false positives, and how they perform in the presence of topological biases at both gene and subnetwork levels. Our analysis revealed insights into algorithmic performance that were previously unattainable. Based on the results of the benchmark study, we presented a practical guide for users on how to select appropriate detection methods and protein-protein interaction networks for cancer pathway identification, and provided suggestions for future algorithm development.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11684898/pdf/
Citations: 0
CDCM: a correlation-dependent connectivity map approach to rapidly screen drugs during outbreaks of infectious diseases.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae659
Junlei Liao, Hongyang Yi, Hao Wang, Sumei Yang, Duanmei Jiang, Xin Huang, Mingxia Zhang, Jiayin Shen, Hongzhou Lu, Yuanling Niu
Abstract: In the context of the global damage caused by coronavirus disease 2019 (COVID-19) and the emergence of the monkeypox virus (MPXV) outbreak as a public health emergency of international concern, research into methods that can rapidly test potential therapeutics during an outbreak of a new infectious disease is urgently needed. Computational drug discovery is an effective way to solve such problems. The existence of various large open databases has mitigated the time and resource consumption of traditional drug development and improved the speed of drug discovery. However, the diversity of cell lines used in various databases remains limited, and previous drug discovery methods are ineffective for cross-cell prediction. In this study, we propose a correlation-dependent connectivity map (CDCM) to achieve cross-cell predictions of drug similarity. The CDCM mainly identifies drug-drug or disease-drug relationships from the perspective of gene networks by exploring the correlation changes between genes and identifying similarities in the effects of drugs or diseases on gene expression. We validated the CDCM on multiple datasets and found that it performed well for drug identification across cell lines. A comparison with the Connectivity Map revealed that our method was more stable and performed better across different cell lines. In the application of the CDCM to COVID-19 and MPXV data, the predictions of potential therapeutic compounds for COVID-19 were consistent with several previous studies, and most of the predicted drugs were found to be experimentally effective against MPXV. This result confirms the practical value of the CDCM. With the ability to predict across cell lines, the CDCM outperforms the Connectivity Map, and it has wider application prospects and a reduced cost of use.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11658818/pdf/
Citations: 0
UPicker: a semi-supervised particle picking transformer method for cryo-EM micrographs.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae636
Chi Zhang, Yiran Cheng, Kaiwen Feng, Fa Zhang, Renmin Han, Jieqing Feng
Abstract: Automatic single particle picking is a critical step in the data processing pipeline of cryo-electron microscopy structure reconstruction. In recent years, several deep learning-based algorithms have been developed, demonstrating their potential to solve this challenge. However, current methods depend heavily on manually labeled training data, which is labor-intensive to produce and prone to biases, especially for high-noise and low-contrast micrographs, resulting in suboptimal precision and recall. To address these problems, we propose UPicker, a semi-supervised transformer-based particle-picking method with a two-stage training process: unsupervised pretraining and supervised fine-tuning. During the unsupervised pretraining, an Adaptive Laplacian of Gaussian region proposal generator is proposed to obtain pseudo-labels from unlabeled data for initial feature learning. For the supervised fine-tuning, UPicker only needs a small amount of labeled data to achieve high accuracy in particle picking. To further enhance model performance, UPicker employs a contrastive denoising training strategy to reduce redundant detections and accelerate convergence, along with a hybrid data augmentation strategy to deal with limited labeled data. Comprehensive experiments on both simulated and experimental datasets demonstrate that UPicker outperforms state-of-the-art particle-picking methods in terms of accuracy and robustness while requiring less labeled data than other transformer-based models. Furthermore, ablation studies demonstrate the effectiveness and necessity of each component of UPicker. The source code and data are available at https://github.com/JachyLikeCoding/UPicker.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631311/pdf/
Citations: 0
BICEP: Bayesian inference for rare genomic variant causality evaluation in pedigrees.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae624
Cathal Ormond, Niamh M Ryan, Mathieu Cap, William Byerley, Aiden Corvin, Elizabeth A Heron
Abstract: Next-generation sequencing is widely applied to the investigation of pedigree data for gene discovery. However, identifying plausible disease-causing variants within a robust statistical framework is challenging. Here, we introduce BICEP: a Bayesian inference tool for rare variant causality evaluation in pedigree-based cohorts. BICEP calculates the posterior odds that a genomic variant is causal for a phenotype based on the variant cosegregation as well as a priori evidence such as deleteriousness and functional consequence. BICEP can correctly identify causal variants for phenotypes with both Mendelian and complex genetic architectures, outperforming existing methodologies. Additionally, BICEP can correctly down-weight common variants that are unlikely to be involved in phenotypic liability in the context of a pedigree, even if they have reasonable cosegregation patterns. The output metrics from BICEP allow for the quantitative comparison of variant causality within and across pedigrees, which is not possible with existing approaches.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11645550/pdf/
Citations: 0
Cell-type deconvolution for bulk RNA-seq data using single-cell reference: a comparative analysis and recommendation guideline.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf031
Xintian Xu, Rui Li, Ouyang Mo, Kai Liu, Justin Li, Pei Hao
Abstract: The accurate estimation of cell type proportions in tissues is crucial for various downstream analyses. With the increasing availability of single-cell sequencing data, numerous deconvolution methods that use single-cell RNA sequencing data as a reference have been developed. However, a unified understanding of how these deconvolution approaches perform in practical applications is still lacking. To address this, we systematically assessed the accuracy and robustness of nine deconvolution methods that use single-cell RNA sequencing data as a reference, evaluating them on real bulk data with cell proportions verified through flow cytometry, as well as simulated bulk data generated from five single-cell RNA sequencing datasets. Our study highlights the importance of several factors, including reference dataset construction strategies, dataset size, cell type subdivision, and cell type inconsistency, for the accuracy and robustness of deconvolution results. We also propose a set of recommended guidelines for software users in diverse scenarios.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11789683/pdf/
Citations: 0
Challenges and opportunities of developing bioinformatics platforms in Africa: the case of BurkinaBioinfo at Joseph Ki-Zerbo University, Burkina Faso.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf040
Ezechiel B Tibiri, Palwende R Boua, Issiaka Soulama, Christine Dubreuil-Tranchant, Ndomassi Tando, Charlotte Tollenaere, Christophe Brugidou, Romaric K Nanema, Fidèle Tiendrebeogo
Abstract: Bioinformatics, an interdisciplinary field combining biology and computer science, enables meaningful information to be extracted from complex biological data. The exponential growth of biological data, driven by high-throughput omics technologies and advanced sequencing methods, requires robust computational resources. Worldwide, bioinformatics skills and computational clusters are essential for managing and analysing large-scale biological datasets across health, agriculture, and environmental science, which are crucial for the African continent. In Burkina Faso, the establishment of bioinformatics infrastructure has been a gradual process. Initial training initiatives between 2015 and 2016, including bioinformatics courses and the establishment of the BurkinaBioinfo (BBi) platform, marked significant progress. Over 250 scientists have been trained at diverse levels in bioinformatics, and 105 user accounts have been created for high-performance computing access. Operational since 2019, this platform has significantly facilitated training programs for scientists and system administrators in west Africa, covering data production, introductory bioinformatics, phylogenetic analysis, and metagenomics. Financial and technical support from various sources has facilitated the rapid development of the platform to meet the growing need for bioinformatics analysis, particularly in conjunction with local 'wet labs'. Establishing a bioinformatics cluster in Burkina Faso involved identifying the needs of researchers, selecting appropriate hardware and installing the necessary bioinformatics tools. At present, the main challenges for the BBi platform include ongoing staff training in bioinformatics skills and high-level IT infrastructure management in the face of growing infrastructure demands. Despite these challenges, the establishment of a bioinformatics platform in Burkina Faso offers significant opportunities for scientific research and economic development in the country.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11789681/pdf/
Citations: 0
TransBic: bucket trend-preserving biclustering for finding local and interpretable expression patterns.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf050
Jing Li, Qinglin Mei, Chaoxia Yang, Naibo Zhu, Guojun Li
Abstract: Biclustering has emerged as a promising approach for analyzing high-dimensional expression data, offering unique advantages in uncovering localized co-expression patterns that traditional clustering methods often miss and thus facilitating advancements in complex disease research and other biomedical applications. However, state-of-the-art methods identify distinct patterns at the expense of losing information about specific patterns, some of which have been used to define cancer subtypes or reflect the progression of a disease or cellular processes. Additionally, these methods exhibit poor effectiveness in noisy environments. To address these limitations, we propose the bucket trend-preserving (BTP) pattern, a novel generalization of existing patterns. We have also developed an algorithm, TransBic, to extract significant biclusters of BTP-patterns. Specifically, TransBic transforms the problem into identifying common multipartite acyclic tournament subdigraphs shared by distinct subsets of acyclic tournament digraphs derived from a given expression matrix. Compared with prominent tools, TransBic demonstrates superior performance in identifying biclusters of all non-row-constant patterns, especially under noise and data fluctuations. Furthermore, TransBic successfully identifies the most disease-related pathways for type 2 diabetes (T2D), colorectal cancer, hepatocellular carcinoma, and breast cancer, outperforming other tools in this regard. Different from previous generalizations, BTP-patterns capture specific up-regulation and down-regulation dynamics. Through targeted analysis of BTP-patterns in T2D expression data, TransBic uncovers biological processes affected by disease risk factors, extending the application of trend-preserving biclustering in expression data analysis.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11794469/pdf/
Citations: 0
Polygenic prediction for underrepresented populations through transfer learning by utilizing genetic similarity shared with European populations.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf048
Yiyang Zhu, Wenying Chen, Kexuan Zhu, Yuxin Liu, Shuiping Huang, Ping Zeng
Abstract: Because current genome-wide association studies are primarily conducted in individuals of European ancestry and information disparities exist among different populations, the polygenic score derived from Europeans exhibits poor transferability. Borrowing the idea of transfer learning, which enables the utilization of knowledge acquired from auxiliary samples to enhance learning capability in target samples, we propose transPGS, a novel polygenic score method, for genetic prediction in underrepresented populations by leveraging genetic similarity shared between the European and non-European populations while explaining the trans-ethnic difference in linkage disequilibrium (LD) and effect sizes. We demonstrate the usefulness and robustness of transPGS in elevated prediction accuracy via individual-level and summary-level simulations and apply it to seven continuous phenotypes and three diseases in the African, Chinese, and East Asian populations of the UK Biobank and Genetic Epidemiology Research Study on Adult Health and Aging cohorts. We further reveal that distinct LD and minor allele frequency patterns across ancestral groups are responsible for the unsatisfactory portability of PGS.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11794457/pdf/
Citations: 0
Exploring the potential of large language model-based chatbots in challenges of ribosome profiling data analysis: a review.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae641
Zheyu Ding, Rong Wei, Jianing Xia, Yonghao Mu, Jiahuan Wang, Yingying Lin
Abstract: Ribosome profiling (Ribo-seq) provides transcriptome-wide insights into protein synthesis dynamics, yet its analysis poses challenges, particularly for non-bioinformatics researchers. Large language model-based chatbots offer promising solutions by leveraging natural language processing. This review explores their convergence, highlighting opportunities for synergy. We discuss challenges in Ribo-seq analysis and how chatbots mitigate them, facilitating scientific discovery. Through case studies, we illustrate chatbots' potential contributions, including data analysis and result interpretation. Despite the absence of applied examples, existing software underscores the value of chatbots and the large language model. We anticipate their pivotal role in future Ribo-seq analysis, overcoming limitations. Challenges such as model bias and data privacy require attention, but emerging trends offer promise. The integration of large language models and Ribo-seq analysis holds immense potential for advancing translational regulation and gene expression understanding.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11638007/pdf/
Citations: 0
FunlncModel: integrating multi-omic features from upstream and downstream regulatory networks into a machine learning framework to identify functional lncRNAs.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae623
Yan-Yu Li, Feng-Cui Qian, Guo-Rui Zhang, Xue-Cang Li, Li-Wei Zhou, Zheng-Min Yu, Wei Liu, Qiu-Yu Wang, Chun-Quan Li
Abstract: Accumulating evidence indicates that long noncoding RNAs (lncRNAs) play important roles in molecular and cellular biology. Although many algorithms have been developed to reveal their associations with complex diseases by using downstream targets, the upstream (epi)genetic regulatory information has not been sufficiently leveraged to predict the function of lncRNAs in various biological processes. Therefore, we present FunlncModel, a machine learning-based interpretable computational framework, which aims to screen out functional lncRNAs by integrating a large number of (epi)genetic features and functional genomic features from their upstream/downstream multi-omic regulatory networks. We adopted the random forest method to mine nearly 60 features in three categories from >2000 datasets across 11 data types, including transcription factors (TFs), histone modifications, typical enhancers, super-enhancers, methylation sites, and mRNAs. FunlncModel outperformed alternative methods for classification performance in human embryonic stem cells (hESCs), with an area under the ROC curve (AUROC) of 0.95 and an area under the precision-recall curve (AUPRC) of 0.97. It could not only infer most known lncRNAs that influence the states of stem cells, but also discover novel high-confidence functional lncRNAs. We extensively validated FunlncModel's efficacy on up to 27 cancer-related functional prediction tasks, which involved multiple cancer cell growth processes and cancer hallmarks. Meanwhile, we have also found that (epi)genetic regulatory features, such as TFs and histone modifications, serve as strong predictors for revealing the function of lncRNAs. Overall, FunlncModel is a strong and stable prediction model for identifying functional lncRNAs in specific cellular contexts. FunlncModel is available as a web server at https://bio.liclab.net/FunlncModel/.
Journal Article. Volume 26, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601888/pdf/
Citations: 0