Bioinformatics (Oxford, England): Latest Articles

OrthologAL: A Shiny application for quality-aware humanization of non-human pre-clinical high-dimensional gene expression data.
Bioinformatics (Oxford, England) Pub Date : 2025-05-20 DOI: 10.1093/bioinformatics/btaf311
Rishika Chowdary, Robert K Suter, Matthew D'Antuono, Cynthia Gomes, Joshua Stein, Ki-Bum Lee, Jae K Lee, Nagi G Ayad
Motivation: Single-cell and spatial transcriptomics provide unprecedented insight into diseases. Pharmacotranscriptomic approaches are powerful tools that leverage gene expression data for drug repurposing and discovery. Multiple databases attempt to connect human cellular transcriptional responses to small molecules for use in transcriptome-based drug discovery efforts. However, preclinical research often requires in vivo experiments in non-human species, which makes utilizing such valuable resources difficult. To facilitate both human orthologous conversion of non-human transcriptomes and the application of pharmacotranscriptomic databases to pre-clinical research models, we introduce OrthologAL. OrthologAL interfaces with BioMart to access different gene sets from the Ensembl database, allowing for ortholog conversion without the need for user-generated code.
Results: Researchers can input their single-cell or other high-dimensional gene expression data from any species as a Seurat object, and OrthologAL will output a human ortholog-converted Seurat object for download and use. To demonstrate the utility of this application, we tested OrthologAL using single-cell, single-nuclei, and spatial transcriptomic data derived from common preclinical models, including patient-derived orthotopic xenografts of medulloblastoma, and mouse and rat models of spinal cord injury. OrthologAL can convert these data types efficiently to that of corresponding orthologs while preserving the dimensional architecture of the original non-human expression data. OrthologAL will be broadly useful for the simple conversion of Seurat objects and for applying preclinical, high-dimensional transcriptomics data to functional human-derived small molecule predictions.
Availability: OrthologAL is available for download as an R package with functions to launch the Shiny GUI at https://github.com/AyadLab/OrthologAL or via Zenodo at https://doi.org/10.5281/zenodo.15225041. The medulloblastoma single-cell transcriptomics data were downloaded from the NCBI Gene Expression Omnibus with the identifier GSE129730. 10X Visium data of medulloblastoma PDX mouse models from Vo et al. were acquired by contacting the authors, and the raw data are available from ArrayExpress under the identifier E-MTAB-11720. The single-cell and single-nuclei transcriptomics data of rat and mouse spinal-cord injury were acquired from the Gene Expression Omnibus under the identifiers GSE213240 and GSE234774.
Supplementary information: Supplementary data are available at Bioinformatics online.
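The ortholog conversion described in this abstract can be pictured with a small sketch. It is not OrthologAL's code: the tool is an R package that retrieves its mappings from Ensembl through BioMart, whereas the table below is hand-written for illustration. The rule of keeping only one-to-one orthologs is an assumption made for the example, and `Hypo1`, `HYPA`, `HYPB`, and `Unmapped1` are hypothetical gene names.

```python
# Illustrative only: a tiny hand-written mouse-to-human ortholog table.
# OrthologAL itself retrieves such mappings from Ensembl via BioMart.
MOUSE_TO_HUMAN = {
    "Trp53": ["TP53"],
    "Gfap": ["GFAP"],
    "Hypo1": ["HYPA", "HYPB"],  # hypothetical one-to-many case
}


def humanize(expression, ortholog_table):
    """Rename genes to human symbols, keeping only one-to-one orthologs.

    Assumption for this sketch: genes with zero or several human
    orthologs are dropped and reported, not merged or duplicated.
    """
    converted = {}
    dropped = []
    for gene, value in expression.items():
        targets = ortholog_table.get(gene, [])
        if len(targets) == 1:
            converted[targets[0]] = value
        else:
            dropped.append(gene)
    return converted, dropped


mouse_counts = {"Trp53": 12, "Gfap": 40, "Hypo1": 7, "Unmapped1": 3}
human_counts, dropped_genes = humanize(mouse_counts, MOUSE_TO_HUMAN)
print(human_counts)
print(dropped_genes)
```

Reporting the dropped genes alongside the converted values is one simple way to keep the conversion auditable; how the actual package scores or reports conversion quality is not specified in the abstract.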
Citations: 0
Beacon Reconstruction Attack: Reconstruction of genomes in genomic data-sharing beacons using summary statistics.
Bioinformatics (Oxford, England) Pub Date : 2025-05-19 DOI: 10.1093/bioinformatics/btaf273
Kousar Saleem, A Ercument Cicek, Sinem Sav
Motivation: The genomic data-sharing beacon protocol, developed by the Global Alliance for Genomics and Health (GA4GH), offers a privacy-preserving mechanism for querying genomic datasets while restricting direct data access. Despite their design, beacons remain vulnerable to privacy attacks. This study introduces a novel privacy vulnerability of the protocol: one can reconstruct large portions of the genomes of all beacon participants by only using the summary statistics reported by the protocol.
Results: We introduce a novel optimization-based algorithm that leverages beacon responses and single nucleotide polymorphism (SNP) correlations for reconstruction. By optimizing for the SNP correlations and allele frequencies, the proposed approach achieves genome reconstruction with a substantially higher F1-score (70%) compared to baseline methods (45%) on beacons generated using individuals from the HapMap and OpenSNP datasets. We show that reconstructed genomes can be used by downstream applications such as in membership inference attacks against other beacons. Our findings reveal that beacons releasing allele frequencies substantially increase the reconstruction risk, underscoring the need for enhanced privacy-preserving mechanisms to protect genomic data.
Availability and implementation: Our implementation is available at https://github.com/ASAP-Bilkent/Beacon-Reconstruction-Attack.
Supplementary information: Supplementary data are available at Bioinformatics online.
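The F1-scores quoted in this abstract compare a reconstructed genome against the true one. As a minimal sketch of how such a score is computed, the function below evaluates a binary SNP vector, assuming that 1 marks presence of the alternate allele and is treated as the positive class. That encoding, and the toy vectors, are assumptions for illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
def f1_score(true_bits, predicted_bits):
    """F1-score of a reconstructed binary SNP vector against the truth."""
    pairs = list(zip(true_bits, predicted_bits))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos == 0:
        return 0.0  # no correct positive calls: precision or recall is zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy data: one individual's SNPs (hypothetical values).
true_genome = [1, 0, 1, 1, 0, 0, 1, 0]
reconstructed = [1, 0, 0, 1, 1, 0, 1, 0]
score = f1_score(true_genome, reconstructed)
print(round(score, 2))  # 3 true positives, 1 false positive, 1 false negative -> 0.75
```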
Citations: 0
DeepProtein: Deep Learning Library and Benchmark for Protein Sequence Learning.
Bioinformatics (Oxford, England) Pub Date : 2025-05-19 DOI: 10.1093/bioinformatics/btaf165
Jiaqing Xie, Tianfan Fu
Motivation: Deep learning has deeply influenced protein science, enabling breakthroughs in predicting protein properties, higher-order structures, and molecular interactions.
Results: This paper introduces DeepProtein, a comprehensive and user-friendly deep learning library tailored for protein-related tasks. It enables researchers to seamlessly address protein data with cutting-edge deep learning models. To assess model performance, we establish a benchmark that evaluates different deep learning architectures across multiple protein-related tasks, including protein function prediction, subcellular localization prediction, protein-protein interaction prediction, and protein structure prediction. Furthermore, we introduce DeepProt-T5, a series of fine-tuned Prot-T5-based models that achieve state-of-the-art performance on four benchmark tasks, while demonstrating competitive results on six others. Comprehensive documentation and tutorials are available to ensure accessibility and support reproducibility.
Availability and implementation: Built upon the widely used drug discovery library DeepPurpose, DeepProtein is publicly available at https://github.com/jiaqingxie/DeepProtein.
Supplementary information: Supplementary data are available at Bioinformatics online.
Citations: 0
HABiC: an algorithm based on the exact computation of the Kantorovich-Rubinstein optimizer for binary classification in transcriptomics.
Bioinformatics (Oxford, England) Pub Date : 2025-05-19 DOI: 10.1093/bioinformatics/btaf310
Chiara Cordier, Pascal Jézéquel, Mario Campone, Fabien Panloup, Agnes Basseville
Motivation: Machine learning analyses of molecular omics datasets largely drive the development of precision medicine in oncology, but mathematical challenges still hamper their application in the clinic. In particular, omics-based learning relies on high dimensional data with high degrees of freedom and multicollinearity issues, requiring more tailored algorithms. Here, we have developed a prediction algorithm that relies on the 1-Wasserstein distance to better capture complex relationships between variables, and that is built on a decision rule based on the exact computation of the Kantorovich-Rubinstein optimizer to increase the algorithm precision. We explored dimension reduction and aggregation methods to improve its robustness. The exact method was compared with a neural network-based approximate method, as well as with standard Euclidean distance-based classifiers.
Results: Experimental results on synthetic datasets with multiple scenarios of redundant/informative variables revealed that exact and approximate methods based on Wasserstein distance outperformed state-of-the-art algorithms when class information was spread across a large number of variables. When predicting clinical or biological outcomes from transcriptomics datasets, HABiC achieved consistently higher accuracy in most situations.
Availability and implementation: Python code for HABiC classifier is available at https://github.com/chiaraco/HABiC.
Supplementary information: Supplementary data are available at Bioinformatics online.
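For readers unfamiliar with the 1-Wasserstein distance that this abstract builds on, here is a minimal one-dimensional sketch. It is not the HABiC classifier and does not compute the Kantorovich-Rubinstein optimizer. It only uses the standard fact that, for two equally sized samples with uniform weights on the real line, the 1-Wasserstein distance equals the mean absolute difference between the sorted values. The sample values are made up.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equally sized 1D samples.

    With uniform weights, the optimal transport plan in one dimension
    matches the i-th smallest value of xs to the i-th smallest of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    matched = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in matched) / len(xs)


# Toy expression values of one gene in two classes (hypothetical numbers).
class_a = [0.0, 1.0, 3.0]
class_b = [5.0, 6.0, 8.0]
distance = wasserstein_1d(class_a, class_b)
print(distance)  # each matched pair differs by 5, so the distance is 5.0
```

In higher dimensions no such closed form exists in general, which is why exact versus approximate computation, as contrasted in the abstract, becomes a real design choice.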
Citations: 0
DeepAllo: Allosteric Site Prediction using Protein Language Model (pLM) with Multitask Learning.
Bioinformatics (Oxford, England) Pub Date : 2025-05-15 DOI: 10.1093/bioinformatics/btaf294
Moaaz Khokhar, Ozlem Keskin, Attila Gursoy
Motivation: Allostery, the process by which binding at one site perturbs a distant site, has become a key focus in the field of drug development because of its substantial impact on protein function. The identification of allosteric pockets (sites) is a challenging task, and several techniques have been developed, including Machine Learning (ML), to predict allosteric pockets that utilize both static and pocket features.
Results: Our work, DeepAllo, is the first study that combines a fine-tuned protein language model (pLM) with FPocket features and shows an increase in prediction performance of allosteric sites over previous studies. The pLM model was fine-tuned on the Allosteric Dataset (ASD) in a Multitask Learning (MTL) setting and was further used as a feature extractor to train XGBoost and AutoML models. The best model predicts allosteric pockets with 89.66% F1 score and 90.5% of allosteric pockets in the top 3 positions, outperforming previous results. A case study has been performed on proteins with known allosteric pockets, which demonstrates the validity of our approach. Moreover, an effort was made to explain the pLM by visualizing its attention mechanism among allosteric and non-allosteric residues.
Availability: The source code is available on GitHub (https://github.com/MoaazK/deepallo) and archived on Zenodo (DOI: 10.5281/zenodo.15255379). The trained model is hosted on Hugging Face (DOI: 10.57967/hf/5198). The dataset used for training and evaluation is archived on Zenodo (DOI: 10.5281/zenodo.15255437).
Supplementary information: Supplementary data, including the full list of proteins used in the study with their PDB IDs, t-SNE analysis of pocket features, confusion matrix breakdown, and interpretation of borderline classification cases, are available as supplementary material alongside this article.
Citations: 0
ASMC: investigating the amino acid diversity of enzyme active sites.
Bioinformatics (Oxford, England) Pub Date : 2025-05-15 DOI: 10.1093/bioinformatics/btaf307
Thomas Bailly, Eddy Elisée, David Vallenet
Motivation: The analysis of enzyme active sites is essential for understanding their activity in terms of catalyzed reaction and substrate specificity, providing insights for engineering to obtain targeted properties or modify the substrate scope. In 2010, a first version of the Active Site Modeling and Clustering (ASMC) workflow was published. ASMC predicts isofunctional clusters from enzyme families, based on structural modeling and clustering of active sites. Since then, structure- and sequence-based methods have developed considerably.
Results: We present here a redesign of the ASMC workflow. This new major version includes recent pocket prediction, structural alignment and clustering methods, as well as a refined amino acid distance matrix, thereby improving the relevance of results and reducing the need for laborious manual analysis to obtain relevant clusters. In addition, we have implemented multiple sequence alignment (MSA) as a possible input for the clustering step, along with an additional script to compare 2D and 3D active sites. Finally, the code has been unified from three to one programming language (Python) to facilitate its installation and maintenance. This new version of ASMC was evaluated on a set of protein families, resulting in overall better performance compared to its original version.
Availability and implementation: ASMC is supported on the Linux operating system and freely available at https://github.com/labgem/ASMC, along with complete documentation (wiki, tutorial).
Supplementary information: Supplementary data are available at Bioinformatics online.
Citations: 0
SOMMD: An R Package for the Analysis of Molecular Dynamics Simulations using Self-Organising Map.
Bioinformatics (Oxford, England) Pub Date : 2025-05-15 DOI: 10.1093/bioinformatics/btaf308
Stefano Motta, Lara Callea, Shaziya Ismail Mulla, Hamid Davoudkhani, Laura Bonati, Alessandro Pandini
Motivation: Molecular Dynamics (MD) simulations provide critical insights into biomolecular processes, but they generate complex high-dimensional data that are often difficult to interpret directly. Dimensionality reduction methods like Principal Component Analysis (PCA), Time-Lagged Independent Component Analysis (TICA) and Self-Organising Maps (SOMs) have helped in extracting essential information on functional dynamics. However, there is a growing need for a user-friendly and flexible framework for SOM-based analyses of MD simulations. Such a framework should offer adaptable workflows, customizable options, and direct integration with a widely adopted analysis software.
Results: We designed and developed SOMMD, an R package to streamline MD analysis workflows. SOMMD facilitates the interpretation of atomistic trajectories through SOMs, providing tools for each stage of the workflow, from importing a wide range of MD trajectory data types to generating enhanced visualizations. The package also includes three example projects that demonstrate how SOM can be applied in real-world scenarios, including cluster analysis, pathway mapping and transition network reconstruction.
Availability: SOMMD is available on CRAN (https://CRAN.R-project.org/package=SOMMD) and on GitHub (https://github.com/alepandini/SOMMD).
Supplementary information: Supplementary data are available at Bioinformatics online.
Citations: 0
Predicting Ustekinumab Treatment Response in Crohn's Disease Using Pre-Treatment Biopsy Images.
Bioinformatics (Oxford, England) Pub Date : 2025-05-14 DOI: 10.1093/bioinformatics/btaf301
Chengfei Cai, Ruidong Chen, Jieyu Chen, Jun Li, Caiyun Lv, Yiping Jiao, Lanqing Wu, Juan Chen, Qi Sun, Qianyun Shi, Jun Xu, Tang Wen, Yao Liu
Motivation: Crohn's disease (CD) exhibits substantial variability in response to biological therapies such as ustekinumab (UST), a monoclonal antibody targeting interleukin-12/23. However, predicting individual treatment responses remains difficult due to the lack of reliable histopathological biomarkers and the morphological complexity of tissue. While recent deep learning methods have leveraged whole-slide images (WSIs), most lack effective mechanisms for selecting relevant regions and integrating patch-level evidence into robust patient-level predictions. Therefore, a framework that captures local histological cues and global tissue context is needed to improve prediction performance. Ustekinumab (UST) is a relatively recent biologic agent used in the treatment of Crohn's Disease (CD). Clinical studies on the treatment response of UST are relatively scarce. However, its efficacy varies among CD patients, highlighting the need for accurate prediction of its treatment response. In this paper, we developed an artificial intelligence (AI) model based on whole-slide images (WSIs) and weakly supervised learning to predict the treatment response of UST in CD patients.
Results: We propose a novel clustering-enhanced weakly supervised learning framework to predict UST treatment response from pre-treatment WSIs of CD patients. First, patches from WSIs were encoded using a pre-trained vision foundation model, and k-means clustering was applied to identify representative morphological patterns. Discriminative patches associated with treatment outcomes were selected via a DenseNet-based classifier, with Grad-CAM used to enhance interpretability. To aggregate patch-level predictions, we adopted a multi-instance learning approach, from which whole-slide features were extracted using both patch likelihood histograms and bag-of-words representations. These features were subsequently used to train a classifier for final response prediction. Experimental results on an independent test set demonstrated that our WSI-level model achieved superior predictive performance with an AUC of 0.938 (95% CI: 0.879-0.996), sensitivity of 0.951, and specificity of 0.825, outperforming baseline patch-level models. These findings suggest that our method enables accurate, interpretable, and scalable prediction of biological therapy response in CD, potentially supporting personalized treatment strategies in clinical settings. 402 tissue samples from CD patients treated with UST were categorized into non-response and response groups based on clinical outcomes. Initially, we selected relevant patches from WSIs, then patch-level treatment efficacy predictions were constructed using deep learning methods. Subsequently, pathological features generated by aggregating the patch-level prediction results were combined with various machine learning algorithms to develop a WSI-level AI model. This enables automatic prediction of UST treatment response for CD [...]
Citations: 0
CrossAttOmics: Multi-Omics data integration with CrossAttention.
Bioinformatics (Oxford, England) Pub Date : 2025-05-13 DOI: 10.1093/bioinformatics/btaf302
Aurélien Beaude, Franck Augé, Farida Zehraoui, Blaise Hanczar
Motivation: Advances in high throughput technologies have enabled broad access to various types of omics. Each omics provides a partial view of the underlying biological process. Integrating multiple omics layers could help achieve a more accurate diagnosis. However, the complexity of omics data requires approaches that can capture complex relationships. One way to accomplish this is by exploiting the known regulatory links between the different omics, which could help in constructing a better multimodal representation.
Results: In this article, we propose CrossAttOmics, a new deep-learning architecture based on the cross-attention mechanism for multi-omics integration. Each modality is projected in a lower dimensional space with its specific encoder. Interactions between modalities with known regulatory links are computed in the feature representation space with cross-attention. The results of different experiments carried out in this paper show that our model can accurately predict the types of cancer by exploiting the interactions between multiple modalities. CrossAttOmics outperforms other methods when there are few paired training examples. Our approach can be combined with attribution methods like LRP to identify which interactions are the most important.
Availability: The code is available at https://github.com/Sanofi-Public/CrossAttOmics and https://doi.org/10.5281/zenodo.15065928. TCGA data can be downloaded from the Genomic Data Commons Data Portal. CCLE data can be downloaded from the depmap portal.
Supplementary information: Supplementary data are available at Bioinformatics online.
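The cross-attention mechanism named in this abstract can be sketched in a few lines. The code below is generic scaled dot-product cross-attention, not the CrossAttOmics architecture: it omits the modality-specific encoders and all learned projection matrices, and the two "modalities" and their embedding values are hypothetical. Queries come from one modality, while keys and values come from another.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention with queries from one modality
    and keys/values from another. No learned weights are used here.
    """
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(row, values)) for d in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical embeddings: 2 tokens from modality A attend to 3 tokens of modality B.
modality_a = [[1.0, 0.0], [0.0, 1.0]]
modality_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention_weights = cross_attention(modality_a, modality_b, modality_b)
print(attention_weights)
```

Each row of `attention_weights` sums to one, so it can be read as how strongly a token of the first modality draws on each token of the second. Restricting attention to modality pairs with known regulatory links, as the abstract describes, would amount to calling such a function only for those pairs.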
Citations: 0
Relation Equivariant Graph Neural Networks to Explore the Mosaic-like Tissue Architecture of Kidney Diseases on Spatially Resolved Transcriptomics.
Bioinformatics (Oxford, England) Pub Date : 2025-05-13 DOI: 10.1093/bioinformatics/btaf303
Mauminah Raina, Hao Cheng, Ricardo Melo Ferreira, Treyden Stansfield, Chandrima Modak, Ying-Hua Cheng, Hari Naga Sai Kiran Suryadevara, Dong Xu, Michael T Eadon, Qin Ma, Juexin Wang
Motivation: Chronic kidney disease (CKD) and Acute Kidney Injury (AKI) are prominent public health concerns affecting more than 15% of the global population. The ongoing development of spatially resolved transcriptomics (SRT) technologies presents a promising approach for discovering the spatial distribution patterns of gene expression within diseased tissues. However, existing computational tools are predominantly calibrated and designed on the ribbon-like structure of the brain cortex, presenting considerable computational obstacles in discerning highly heterogeneous mosaic-like tissue architectures in the kidney. Consequently, timely and cost-effective acquisition of annotation and interpretation in the kidney remains a challenge in exploring the cellular and morphological changes within renal tubules and their interstitial niches.
Results: We present an empowered graph deep learning framework, REGNN (Relation Equivariant Graph Neural Networks), designed for SRT data analyses on heterogeneous tissue structures. To increase expressive power in the SRT lattice using graph modeling, REGNN integrates equivariance to handle n-dimensional symmetries of the spatial area, while additionally leveraging Positional Encoding to strengthen relative spatial relations of the nodes uniformly distributed in the lattice. Given the limited availability of well-labeled spatial data, this framework implements both graph autoencoder and graph self-supervised learning strategies. On heterogeneous samples from different kidney conditions, REGNN outperforms existing computational tools in identifying tissue architectures within the 10X Visium platform. This framework offers a powerful graph deep learning tool for investigating tissues within highly heterogeneous expression patterns and paves the way to pinpoint underlying pathological mechanisms that contribute to the progression of complex diseases.
Availability: REGNN is publicly available at https://github.com/Mraina99/REGNN.
Supplementary information: Found in the attached supplementary file 'SupplementaryFile_ManuscriptBioinformatics'.
Citations: 0