bioRxiv - Bioinformatics: Latest Articles

ECSFinder: Optimized prediction of evolutionarily conserved RNA secondary structures from genome sequences
bioRxiv - Bioinformatics Pub Date : 2024-09-19 DOI: 10.1101/2024.09.14.612549
Vanda A Gaonac'h-Lovejoy, Martin Sauvageau, John S Mattick, Martin A Smith
{"title":"ECSFinder: Optimized prediction of evolutionarily conserved RNA secondary structures from genome sequences","authors":"Vanda A Gaonac'h-Lovejoy, Martin Sauvageau, John S Mattick, Martin A Smith","doi":"10.1101/2024.09.14.612549","DOIUrl":"https://doi.org/10.1101/2024.09.14.612549","url":null,"abstract":"Accurate prediction of RNA secondary structures is essential for understanding the evolutionary conservation and functional roles of long noncoding RNAs (lncRNAs) across diverse species. In this study, we benchmarked two leading tools for predicting evolutionarily conserved RNA secondary structures (ECSs), SISSIz and R-scape, using two distinct experimental frameworks: one focusing on well-characterized mitochondrial RNA structures and the other on experimentally validated Rfam structures embedded within simulated genome alignments. While both tools performed comparably overall, each displayed subtle preferences in detecting ECSs. To address these limitations, we evaluated two interpretable machine learning approaches that integrate the strengths of both methods. By balancing thermodynamic stability features from RNALalifold and SISSIz with robust covariation metrics from R-scape, a random forest classifier significantly outperformed both conventional tools. This classifier was implemented in ECSfinder, a new tool that provides a robust, interpretable solution for genome-wide identification of conserved RNA structures, offering valuable insights into lncRNA function and evolutionary conservation. 
ECSfinder is designed for large-scale comparative genomics applications and promises to facilitate the discovery of novel functional RNA elements.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"188 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Interpretable high-resolution dimension reduction of spatial transcriptomics data by DeepFuseNMF
bioRxiv - Bioinformatics Pub Date : 2024-09-19 DOI: 10.1101/2024.09.12.612666
Junjie Tang, Zihao Chen, Kun Qian, Siyuan Huang, Yang He, Shenyi Yin, Xinyu He, Buqing Ye, Yan Zhuang, Hongxue Meng, Jianzhong Xi, Ruibin Xi
{"title":"Interpretable high-resolution dimension reduction of spatial transcriptomics data by DeepFuseNMF","authors":"Junjie Tang, Zihao Chen, Kun Qian, Siyuan Huang, Yang He, Shenyi Yin, Xinyu He, Buqing Ye, Yan Zhuang, Hongxue Meng, Jianzhong Xi, Ruibin Xi","doi":"10.1101/2024.09.12.612666","DOIUrl":"https://doi.org/10.1101/2024.09.12.612666","url":null,"abstract":"Spatial transcriptomics (ST) technologies have revolutionized tissue architecture studies by capturing gene expression with spatial context. However, high-dimensional ST data often have limited spatial resolution and exhibit considerable noise and sparsity, posing significant challenges in deciphering subtle spatial structures and underlying biological activities. Here, we introduce DeepFuseNMF, a multi-modal dimension reduction framework that enhances spatial resolution by integrating ST gene expression with high-resolution histology images. DeepFuseNMF incorporates non-negative matrix factorization into a neural network architecture, enabling the identification of interpretable, high resolution embeddings. Furthermore, DeepFuseNMF can simultaneously analyze multiple samples and is compatible with various types of histology images. Extensive evaluations on synthetic and real ST datasets from various technologies and tissue types demonstrate that DeepFuseNMF can effectively produce highly interpretable, high-resolution embeddings, and detects refined spatial structures. 
DeepFuseNMF represents a powerful approach for integrating ST data and histology images, offering deeper insights into complex tissue structures and functions.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"66 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142268538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Bioinformatician, Computer Scientist, and Geneticist lead bioinformatic tool development - which one is better?
bioRxiv - Bioinformatics Pub Date : 2024-09-19 DOI: 10.1101/2024.08.25.609622
Paul P. Gardner
{"title":"A Bioinformatician, Computer Scientist, and Geneticist lead bioinformatic tool development - which one is better?","authors":"Paul P. Gardner","doi":"10.1101/2024.08.25.609622","DOIUrl":"https://doi.org/10.1101/2024.08.25.609622","url":null,"abstract":"The development of accurate bioinformatic software tools is crucial for the effective analysis of complex biological data. This study examines the relationship between the academic department affiliations of authors and the accuracy of the bioinformatic tools they develop. By analyzing a corpus of previously benchmarked bioinformatic software tools, we mapped bioinformatic tools to the academic fields of the corresponding authors and evaluated tool accuracy by field. Our results suggest that \"Medical Informatics\" outperforms all other fields in bioinformatic software accuracy, with a mean proportion of wins in accuracy rankings exceeding the null expectation. In contrast, tools developed by authors affiliated with \"Bioinformatics\" and \"Engineering\" fields tend to be less accurate. However, after correcting for multiple testing, no result is statistically significant (p > 0.05). Our findings reveal no strong association between academic field and bioinformatic software accuracy. 
These findings suggest that the development of interdisciplinary software applications can be effectively undertaken by any department with sufficient resources and training.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
GeneSpectra: a method for context-aware comparison of cell type gene expression across species
bioRxiv - Bioinformatics Pub Date : 2024-09-19 DOI: 10.1101/2024.06.21.600109
Yuyao Song, Irene Papatheodorou, Alvis Brazma
{"title":"GeneSpectra: a method for context-aware comparison of cell type gene expression across species","authors":"Yuyao Song, Irene Papatheodorou, Alvis Brazma","doi":"10.1101/2024.06.21.600109","DOIUrl":"https://doi.org/10.1101/2024.06.21.600109","url":null,"abstract":"Computational comparison of single cell expression profiles cross-species uncovers functional similarities and differences between cell types. Importantly, it offers the potential to refine evolutionary relationships based on gene expression. Current analysis strategies are limited by the strong hypothesis of ortholog conjecture, which implies that orthologs have similar cell type expression patterns. They also lose expression information from non-orthologs, making them inapplicable in practice for large evolutionary distances. To address these limitations, we devised a novel analytical framework, GeneSpectra, to robustly classify genes by their expression specificity and distribution across cell types. This framework allows for the generalization of the ortholog conjecture by evaluating the degree of ortholog class conservation. We utilise different gene classes to decode species effects on cross-species transcriptomics space and compare sequence conservation with expression specificity similarity across different types of orthologs. We develop contextualised cell type similarity measurements while considering species-unique genes and non-one-to-one orthologs. 
Finally, we consolidate gene classification results into a knowledge graph, GeneSpectraKG, allowing a hierarchical depiction of cell types and orthologous groups, while continuously integrating new data.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The "very moment" when UDG recognizes a flipped-out uracil base in dsDNA
bioRxiv - Bioinformatics Pub Date : 2024-09-18 DOI: 10.1101/2024.09.13.612628
Vinnarasi Saravanan, Nessim Raouraoua, Guillaume Brysbaert, Stefano Giordano, Marc F Lensink, Fabrizio Cleri, Ralf Blossey
{"title":"The \"very moment\" when UDG recognizes a flipped-out uracil base in dsDNA","authors":"Vinnarasi Saravanan, Nessim Raouraoua, Guillaume Brysbaert, Stefano Giordano, Marc F Lensink, Fabrizio Cleri, Ralf Blossey","doi":"10.1101/2024.09.13.612628","DOIUrl":"https://doi.org/10.1101/2024.09.13.612628","url":null,"abstract":"Uracil-DNA glycosylase (UDG) is the first enzyme in the base-excision repair (BER) pathway, acting on uracil bases in DNA. How UDG finds its targets has not been conclusively resolved yet. Based on available structural and other experimental evidence, two possible pathways are under discussion. In one, the action of UDG on the DNA bases is believed to follow a \"pinch-push-pull\" model, in which UDG generates the base-flip in an active manner. A second scenario is based on the exploitation of bases flipping out thermally from the DNA. Recent molecular dynamics (MD) studies of DNA in trinucleosome arrays have shown that base-flipping can be readily induced by the action of mechanical forces on DNA alone. This alternative mechanism could possibly enhance the probability for the second scenario of UDG-uracil interaction via the formation of a recognition complex of UDG with a flipped-out base. In this work we describe DNA structures with flipped-out uracil bases generated by MD simulations which we then subject to docking simulations with the UDG enzyme. 
Our results for the UDG-uracil recognition complex support the view that base-flipping induced by DNA mechanics can be a relevant mechanism of uracil base recognition by the UDG glycosylase in chromatin.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"138 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
metagWGS, a comprehensive workflow to analyze metagenomic data using Illumina or PacBio HiFi reads
bioRxiv - Bioinformatics Pub Date : 2024-09-18 DOI: 10.1101/2024.09.13.612854
Jean Mainguy, Mäina Vienne, Joanna Fourquet, Vincent Darbot, Céline Noirot, Adrien Castinel, Sylvie Combes, Christine Gaspin, Denis Milan, Cecile Donnadieu, Carole Iampietro, Olivier Bouchez, Géraldine Pascal, Claire Hoede
{"title":"metagWGS, a comprehensive workflow to analyze metagenomic data using Illumina or PacBio HiFi reads","authors":"Jean Mainguy, Mäina Vienne, Joanna Fourquet, Vincent Darbot, Céline Noirot, Adrien Castinel, Sylvie Combes, Christine Gaspin, Denis Milan, Cecile Donnadieu, Carole Iampietro, Olivier Bouchez, Géraldine Pascal, Claire Hoede","doi":"10.1101/2024.09.13.612854","DOIUrl":"https://doi.org/10.1101/2024.09.13.612854","url":null,"abstract":"Background: To study communities of micro-organisms taxonomically and functionally, metagenomic analyses are now often used. If there is no reference gene catalogue, a de novo approach is required. Because genomes are easier to interpret than contigs, the recovery of metagenome-assembled genomes (MAGs) by binning of contigs from metagenomic data has recently become a common task for microbial studies. However, during this process, there is a significant loss of information between the assembly and the binning of contigs. This is why it is important to produce taxonomic and functional matrices for all contigs and not just those included in correct bins. In addition, Pacbio HiFi reads (long and of good quality) are now a possible, albeit more expensive, alternative to short Illumina reads. We therefore developed a workflow that is easy to install with dependencies fixed using singularity images and easy to use on a computing cluster, that is capable of analyzing either short or long reads, and that should allow analysis at the contig and/or bin level, depending on the user's choice. Following is a presentation of metagWGS, a fully automated workflow for metagenomic data analysis. It uses a new tool for refining bins (called Binette) that we will demonstrate is more efficient than competing tools. Methods: metagWGS is a Nextflow workflow distributed with two singularity images and complete documentation to facilitate its installation and use. 
Because the main original features of metagWGS concern binning (short and long reads) and the analysis of HiFi reads, we compared metagWGS with the MAG construction workflow proposed by PacBio to a public dataset used by Pacbio to promote its workflow. Results: metagWGS differs from existing workflows by (i) offering flexible approaches for the assembly; (ii) supporting short reads (Illumina) or PacBio HiFi reads; (iii) combining multiple binning algorithms with a new bin refinement tool, referred to as Binette, to achieve high-quality genome bins; and (iv) providing taxonomic and functional annotation for all genes, all contigs built and bins. metagWGS produces more medium (708) and high-quality (255) bins on 11 public metagenomic samples from human gut data than the Pacbio HiFi dedicated workflow, referred to as the HiFi-MAGS-pipeline (659 medium quality bins and 231 high quality bins), primarily due to the better performance of Binette.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"186 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
PANOMIQ: A Unified Approach to Whole-Genome, Exome, and Microbiome Data Analysis
bioRxiv - Bioinformatics Pub Date : 2024-09-18 DOI: 10.1101/2024.09.17.613203
Shivani Srivastava, Saba Ehsan, Linkon Chowdhury, Muhammad Omar Faruk, Abhishek Singh, Anmol S Kapoor, Sidharth Bhinder, Mohan P Singh, Divya Mishra
{"title":"PANOMIQ: A Unified Approach to Whole-Genome, Exome, and Microbiome Data Analysis","authors":"Shivani Srivastava, Saba Ehsan, Linkon Chowdhury, Muhammad Omar Faruk, Abhishek Singh, Anmol S Kapoor, Sidharth Bhinder, Mohan P Singh, Divya Mishra","doi":"10.1101/2024.09.17.613203","DOIUrl":"https://doi.org/10.1101/2024.09.17.613203","url":null,"abstract":"The integration of whole-genome sequencing (WGS), whole-exome sequencing (WES), and microbiome analysis has become essential for advancing our understanding of complex biological systems. However, the fragmented nature of current analytical tools often complicates the process, leading to inefficiencies and potential data loss. To address this challenge, we present PANOMIQ, a comprehensive software solution that unifies the analysis of WGS, WES, and microbiome data into a single, streamlined pipeline. PANOMIQ is designed to facilitate the entire analysis process from raw data to interpretable results. It is the fastest algorithm that can achieve results much more quickly compared to traditional pipeline approaches of WGS and WES analysis. It incorporates advanced algorithms for high-accuracy variant calling in both WGS and WES, along with robust tools for characterizing microbial communities. The software's modular architecture allows for seamless integration of these diverse data types, enabling researchers to uncover complex interactions between host genomics and microbiomes. In this study, we demonstrate the capabilities of PANOMIQ by applying it to a series of datasets encompassing a wide range of applications, including disease association studies and environmental microbiome profiling. Our results highlight PANOMIQ's ability to deliver comprehensive insights, significantly reducing the time and computational resources required for multi-omic analysis. 
By providing a unified platform for WGS, WES, and microbiome analysis, PANOMIQ offers a powerful tool for researchers aiming to explore the full spectrum of genomic and microbial diversity. This software not only simplifies the analytical workflow but also enhances the depth of biological interpretation, paving the way for more integrated and holistic studies in genomics and microbiology.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FunCoup 6: advancing functional association networks across species with directed links and improved user experience
bioRxiv - Bioinformatics Pub Date : 2024-09-18 DOI: 10.1101/2024.09.13.612391
Davide Buzzao, Emma Persson, Dimitri Guala, Erik L L Sonnhammer
{"title":"FunCoup 6: advancing functional association networks across species with directed links and improved user experience","authors":"Davide Buzzao, Emma Persson, Dimitri Guala, Erik L L Sonnhammer","doi":"10.1101/2024.09.13.612391","DOIUrl":"https://doi.org/10.1101/2024.09.13.612391","url":null,"abstract":"FunCoup 6 (https://funcoup6.scilifelab.se/, will be https://funcoup.org after publication) represents a significant advancement in global functional association networks, aiming to provide researchers with a comprehensive view of the functional coupling interactome. This update introduces novel methodologies and integrated tools for improved network inference and analysis. Major new developments in FunCoup 6 include vastly expanding the coverage of gene regulatory links, a new framework for bin-free Bayesian training, and a new website. FunCoup 6 integrates a new tool for disease and drug target module identification using the TOPAS algorithm. To expand the utility of the resource for biomedical research, it incorporates pathway enrichment analysis using the ANUBIX and EASE algorithms. The unique comparative interactomics analysis in FunCoup provides insights of network conservation, now allowing users to align orthologs only or query each species network independently. Bin-free training was applied to 23 primary species, and in addition networks were generated for all remaining 618 species in InParanoiDB 9. 
Accompanying these advancements, FunCoup 6 features a new redesigned website, together with updated API functionalities, and represents a pivotal step forward in functional genomics research, offering unique capabilities for exploring the complex landscape of protein interactions.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"64 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Iterative Approach to Polish the Nanopore Sequencing Basecalling for Therapeutic RNA Quality Control
bioRxiv - Bioinformatics Pub Date : 2024-09-18 DOI: 10.1101/2024.09.12.612711
Ziyuan Wang, Mei-Juan Tu, Ziyang Liu, Katherine K Wang, Yinshan Fang, Ning Hao, Hao Helen Zhang, Jianwen Que, Xiaoxiao Sun, Ai-Ming Yu, HONGXU DING
{"title":"An Iterative Approach to Polish the Nanopore Sequencing Basecalling for Therapeutic RNA Quality Control","authors":"Ziyuan Wang, Mei-Juan Tu, Ziyang Liu, Katherine K Wang, Yinshan Fang, Ning Hao, Hao Helen Zhang, Jianwen Que, Xiaoxiao Sun, Ai-Ming Yu, HONGXU DING","doi":"10.1101/2024.09.12.612711","DOIUrl":"https://doi.org/10.1101/2024.09.12.612711","url":null,"abstract":"Nucleotide modifications deviate nanopore sequencing readouts, therefore generating artifacts during the basecalling of sequence backbones. Here, we present an iterative approach to polish modification-disturbed basecalling results. We show such an approach is able to promote the basecalling accuracy of both artificially-synthesized and real-world molecules. With demonstrated efficacy and reliability, we exploit the approach to precisely basecall therapeutic RNAs consisting of artificial or natural modifications, as the basis for quantifying the purity and integrity of vaccine mRNAs which are transcribed in vitro, and for determining modification hotspots of novel therapeutic RNA interference (RNAi) molecules which are bioengineered (BioRNA) in vivo.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
RCoxNet: deep learning framework for enhanced cancer survival prediction integrating random walk with restart with mutation and clinical data
bioRxiv - Bioinformatics Pub Date : 2024-09-18 DOI: 10.1101/2024.09.17.613428
Stuti Kumari, Sakshi Gujral, Smruti Panda, Prashant Gupta, Gaurav Ahuja, Debarka Sengupta
{"title":"RCoxNet: deep learning framework for enhanced cancer survival prediction integrating random walk with restart with mutation and clinical data","authors":"Stuti Kumari, Sakshi Gujral, Smruti Panda, Prashant Gupta, Gaurav Ahuja, Debarka Sengupta","doi":"10.1101/2024.09.17.613428","DOIUrl":"https://doi.org/10.1101/2024.09.17.613428","url":null,"abstract":"Cancer poses a significant global health challenge, characterized by a complex disease progression and disrupted growth regulation. A thorough understanding of cellular and molecular biological mechanisms is essential for developing novel treatments and improving the accuracy of patient survival predictions. While prior studies have leveraged gene expression and clinical data to forecast survival outcomes through current machine learning and deep learning approaches, gene mutation data, despite being a widely recognized metric, has rarely been incorporated due to its limited information, inadequate representation of gene relationships, and data sparsity, which negatively affects the robustness, effectiveness, and interpretability of current survival analysis approaches. To overcome the challenges of mutation data sparsity, we propose RCoxNet, a novel deep learning neural network framework that integrates the Random Walk with Restart (RWR) algorithm with a deep learning Cox Proportional Hazards model. By applying this framework to mutation data from cBioportal, our model achieved an average concordance index of 0.62 ± 0.05 across four cancer types, outperforming existing deep neural network models. 
Additionally, we identified clinical features critical for differentiating between predicted high- and low-risk patients, with the relevance of these features being partially supported by previous studies.","PeriodicalId":501307,"journal":{"name":"bioRxiv - Bioinformatics","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
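The RCoxNet abstract above names the Random Walk with Restart (RWR) algorithm as its way of spreading sparse mutation signal over a gene network. As a rough illustration of the generic technique only, and not the authors' implementation, the following minimal sketch iterates p ← (1 − r)·W·p + r·p0 on a toy graph. The function name, the parameter names, the restart value of 0.5, and the four-node path graph are all assumptions made for this example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Generic RWR sketch (hypothetical helper, not taken from RCoxNet).

    adjacency: square list-of-lists of non-negative edge weights.
    seeds: indices of the nodes the walker restarts from (e.g. mutated genes).
    Returns the approximate steady-state visiting probability of every node.
    """
    n = len(adjacency)
    # Column-normalise the adjacency matrix so each non-empty column sums to 1.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    W = [[adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    # Restart distribution: uniform over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        # One step: walk along an edge with prob. (1 - restart), else jump back to a seed.
        nxt = [(1.0 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        # Stop once the L1 change between iterations falls below the tolerance.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy example (assumed data): a path graph 0 - 1 - 2 - 3 with node 0 as the only seed.
toy_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_graph, seeds=[0])
```

On this toy graph the scores decrease with distance from the seed, which is the smoothing behaviour that makes RWR attractive for sparse inputs. How RCoxNet builds its network, chooses the restart probability, or feeds the scores into the Cox model is not stated in the abstract and is not represented here.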