{"title":"Bioinformatics in Russia: history and present-day landscape.","authors":"Muhammad A Nawaz, Igor E Pamirsky, Kirill S Golokhvast","doi":"10.1093/bib/bbae513","DOIUrl":"10.1093/bib/bbae513","url":null,"abstract":"<p><p>Bioinformatics has become an interdisciplinary subject due to its universal role in molecular biology research. The current status of bioinformatics research in Russia is not known. Here, we review the history of bioinformatics in Russia, present the current landscape, and highlight future directions and challenges. Bioinformatics research in Russia is driven by four major industries: information technology, pharmaceuticals, biotechnology, and agriculture. Over the past three decades, despite a delayed start, the field has gained momentum, especially in protein and nucleic acid research. Dedicated and shared centers for genomics, proteomics, and bioinformatics are active in different regions of Russia. Present-day bioinformatics in Russia is characterized by research issues related to genetics, metagenomics, OMICs, medical informatics, computational biology, environmental informatics, and structural bioinformatics. Notable developments are in the fields of software (tools, algorithms, and pipelines), use of high computation power (e.g. by the Siberian Supercomputer Center), and large-scale sequencing projects (the sequencing of 100 000 human genomes). Government funding is increasing, policies are being changed, and a National Genomic Information Database is being established. An increased focus on eukaryotic genome sequencing, the development of a common place for developers and researchers to share tools and data, and the use of biological modeling, machine learning, and biostatistics are key areas for future focus. Universities and research institutes have started to implement bioinformatics modules. 
A critical mass of bioinformaticians is essential to catch up with the global pace in the discipline.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11473191/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142458366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mut-Map: Comprehensive Computational Pipeline for Structural Mapping and Analysis of Cancer-Associated Mutations.","authors":"Ali F Alsulami","doi":"10.1093/bib/bbae514","DOIUrl":"https://doi.org/10.1093/bib/bbae514","url":null,"abstract":"<p><p>Understanding the functional impact of genetic mutations on protein structures is essential for advancing cancer research and developing targeted therapies. The main challenge lies in accurately mapping these mutations to protein structures and analysing their effects on protein function. To address this, Mut-Map (https://genemutation.org/) is a comprehensive computational pipeline designed to integrate mutation data from the Catalogue Of Somatic Mutations In Cancer database with protein structural data from the Protein Data Bank and AlphaFold models. The pipeline begins by taking a UniProt ID and proceeds through mapping corresponding Protein Data Bank structures, renumbering residues, and assessing disorder percentages. It then overlays mutation data, categorizes mutations based on structural context, and visualizes them using advanced tools like MolStar. This approach allows for a detailed analysis of how mutations may disrupt protein function by affecting key regions such as DNA interfaces, ligand-binding sites, and dimer interactions. To validate the pipeline, a case study on the TP53 gene, a critical tumour suppressor often mutated in cancers, was conducted. The analysis highlighted the most frequent mutations occurring at the DNA-binding interface, providing insights into their potential role in cancer progression. 
Mut-Map offers a powerful resource for elucidating the structural implications of cancer-associated mutations, paving the way for more targeted therapeutic strategies and advancing our understanding of protein structure-function relationships.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11483132/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142458387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized patient-specific immune checkpoint inhibitor therapies for cancer treatment based on tumor immune microenvironment modeling.","authors":"Yao Yao, Youhua Frank Chen, Qingpeng Zhang","doi":"10.1093/bib/bbae547","DOIUrl":"https://doi.org/10.1093/bib/bbae547","url":null,"abstract":"<p><p>Enhancing patient response to immune checkpoint inhibitors (ICIs) is crucial in cancer immunotherapy. We aim to create a data-driven mathematical model of the tumor immune microenvironment (TIME) and utilize deep reinforcement learning (DRL) to optimize patient-specific ICI therapy combined with chemotherapy (ICC). Using patients' genomic and transcriptomic data, we develop an ordinary differential equations (ODEs)-based TIME dynamic evolutionary model to characterize interactions among chemotherapy, ICIs, immune cells, and tumor cells. A DRL agent is trained to determine the personalized optimal ICC therapy. Numerical experiments with real-world data demonstrate that the proposed TIME model can predict ICI therapy response. The DRL-derived personalized ICC therapy outperforms predefined fixed schedules. For tumors with extremely low CD8+ T cell infiltration ('extremely cold tumors'), the DRL agent recommends high-dosage chemotherapy alone. For tumors with higher CD8+ T cell infiltration ('cold' and 'hot tumors'), an appropriate chemotherapy dosage induces CD8+ T cell proliferation, enhancing ICI therapy outcomes. Specifically, for 'hot tumors', chemotherapy and ICI are administered simultaneously, while for 'cold tumors', a mid-dosage of chemotherapy makes the TIME 'hotter' before ICI administration. However, in several 'cold tumors' with rapid resistant tumor cell growth, ICC eventually fails. This study highlights the potential of utilizing real-world clinical data and a DRL algorithm to develop personalized optimal ICC by understanding the complex biological dynamics of a patient's TIME. 
Our ODE-based TIME dynamic evolutionary model offers a theoretical framework for determining the best use of ICI, and the proposed DRL agent may guide personalized ICC schedules.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11503752/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142495341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IPFMC: an iterative pathway fusion approach for enhanced multi-omics clustering in cancer research.","authors":"Haoyang Zhang, Sha Liu, Bingxin Li, Xionghui Zhou","doi":"10.1093/bib/bbae541","DOIUrl":"10.1093/bib/bbae541","url":null,"abstract":"<p><p>Using multi-omics data for clustering (cancer subtyping) is crucial for precision medicine research. Despite numerous methods having been proposed, current approaches either do not perform satisfactorily or lack biological interpretability, limiting the practical application of these methods. Based on the biological hypothesis that patients with the same subtype may exhibit similar dysregulated pathways, we developed an Iterative Pathway Fusion approach for enhanced Multi-omics Clustering (IPFMC), a novel multi-omics clustering method involving two data fusion stages. In the first stage, omics data are partitioned at each layer using pathway information, with crucial pathways iteratively selected to represent samples. Ultimately, the representation information from multiple pathways is integrated. In the second stage, similarity network fusion was applied to integrate the representation information from multiple omics. Comparative experiments with nine cancer datasets from The Cancer Genome Atlas (TCGA), involving systematic comparisons with 10 representative methods, reveal that IPFMC outperforms these methods. 
Additionally, the biological pathways and genes identified by our approach hold biological significance, affirming not only its excellent clustering performance but also its biological interpretability.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11514061/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142520982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal contrastive learning for spatial gene expression prediction using histology images.","authors":"Wenwen Min, Zhiceng Shi, Jun Zhang, Jun Wan, Changmiao Wang","doi":"10.1093/bib/bbae551","DOIUrl":"https://doi.org/10.1093/bib/bbae551","url":null,"abstract":"<p><p>In recent years, the advent of spatial transcriptomics (ST) technology has unlocked unprecedented opportunities for delving into the complexities of gene expression patterns within intricate biological systems. Despite its transformative potential, the prohibitive cost of ST technology remains a significant barrier to its widespread adoption in large-scale studies. An alternative, more cost-effective strategy involves employing artificial intelligence to predict gene expression levels using readily accessible whole-slide images stained with Hematoxylin and Eosin (H&E). However, existing methods have yet to fully capitalize on multimodal information provided by H&E images and ST data with spatial location. In this paper, we propose mclSTExp, a multimodal contrastive learning with Transformer and Densenet-121 encoder for Spatial Transcriptomics Expression prediction. We conceptualize each spot as a \"word\", integrating its intrinsic features with spatial context through the self-attention mechanism of a Transformer encoder. This integration is further enriched by incorporating image features via contrastive learning, thereby enhancing the predictive capability of our model. We conducted an extensive evaluation of highly variable genes in two breast cancer datasets and a skin squamous cell carcinoma dataset, and the results demonstrate that mclSTExp exhibits superior performance in predicting spatial gene expression. Moreover, mclSTExp has shown promise in interpreting cancer-specific overexpressed genes, elucidating immune-related genes, and identifying specialized spatial domains annotated by pathologists. 
Our source code is available at https://github.com/shizhiceng/mclSTExp.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142543789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DEWNA: dynamic entropy weight network analysis and its application to the DNA-binding proteome in A549 cells with cisplatin-induced damage.","authors":"Shisheng Wang, Wenjuan Zeng, Yin Yang, Jingqiu Cheng, Dan Liu, Hao Yang","doi":"10.1093/bib/bbae564","DOIUrl":"10.1093/bib/bbae564","url":null,"abstract":"<p><p>Cisplatin is one of the most commonly used chemotherapy drugs for treating solid tumors. As a genotoxic agent, cisplatin binds to DNA and forms platinum-DNA adducts that cause DNA damage and activate a series of signaling pathways mediated by various DNA-binding proteins (DBPs), ultimately leading to cell death. Therefore, DBPs play crucial roles in the cellular response to cisplatin and in determining cell fate. However, systematic studies of DBPs responding to cisplatin damage and their temporal dynamics are still lacking. To address this, we developed a novel and user-friendly stand-alone software, DEWNA, designed for dynamic entropy weight network analysis to reveal the dynamic changes of DBPs and their functions. DEWNA utilizes the entropy weight method, multiscale embedded gene co-expression network analysis and generalized reporter score-based analysis to process time-course proteome expression data, helping scientists identify protein hubs and pathway entropy profiles during disease progression. We applied DEWNA to a dataset of DBPs from A549 cells responding to cisplatin-induced damage across 8 time points, with data generated by data-independent acquisition mass spectrometry (DIA-MS). The results demonstrate that DEWNA can effectively identify protein hubs and associated pathways that are significantly altered in response to cisplatin-induced DNA damage, and offer a comprehensive view of how different pathways interact and respond dynamically over time to cisplatin treatment. 
Notably, we observed the dynamic activation of distinct DNA repair pathways and cell death mechanisms during the drug treatment time course, providing new insights into the molecular mechanisms underlying the cellular response to DNA damage.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11530294/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142563963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MetaAll: integrative bioinformatics workflow for analysing clinical metagenomic data.","authors":"Martin Bosilj, Alen Suljič, Samo Zakotnik, Jan Slunečko, Rok Kogoj, Misa Korva","doi":"10.1093/bib/bbae597","DOIUrl":"10.1093/bib/bbae597","url":null,"abstract":"<p><p>Over the past decade, there have been many improvements in the field of metagenomics, including sequencing technologies, advances in bioinformatics and the development of reference databases, but a one-size-fits-all sequencing and bioinformatics pipeline does not yet seem achievable. In this study, we address the bioinformatics part of the analysis by combining three methods into a three-step workflow that increases the sensitivity and specificity of clinical metagenomics and improves pathogen detection. The individual tools are combined into MetaAll, a user-friendly workflow suitable for analysing short paired-end (PE) and long reads from metagenomics datasets. To demonstrate the applicability of the developed workflow, four complicated clinical cases with different disease presentations and multiple samples collected from different biological sites as well as the CAMI Clinical pathogen detection challenge dataset were used. MetaAll was able to identify putative pathogens in all but one case. In this case, however, traditional microbiological diagnostics were also unsuccessful. In addition, co-infection with Haemophilus influenzae and Human rhinovirus C54 was detected in case 1 and co-infection with SARS-CoV-2 and Influenza A virus (FluA) subtype H3N2 was detected in case 3. In case 2, in which conventional diagnostics could not find a pathogen, mNGS pointed to Klebsiella pneumoniae as the suspected pathogen. 
Finally, this study demonstrated the importance of combining read classification, contig validation and targeted reference mapping for more reliable detection of infectious agents in clinical metagenome samples.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568877/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142643856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Atomistic simulations reveal impacts of missense mutations on the structure and function of SynGAP1.","authors":"Aliaa E Ali, Li-Li Li, Michael J Courtney, Olli T Pentikäinen, Pekka A Postila","doi":"10.1093/bib/bbae458","DOIUrl":"10.1093/bib/bbae458","url":null,"abstract":"<p><p>De novo mutations in the synaptic GTPase activating protein (SynGAP) are associated with neurological disorders like intellectual disability, epilepsy, and autism. SynGAP is also implicated in Alzheimer's disease and cancer. Although pathogenic variants are highly penetrant in neurodevelopmental conditions, a substantial number of them are caused by missense mutations that are difficult to diagnose. Hence, in silico mutagenesis was performed for probing the missense effects within the N-terminal region of SynGAP structure. Through extensive molecular dynamics simulations, encompassing three 150-ns replicates for 211 variants, the impact of missense mutations on the protein fold was assessed. The effect of the mutations on the folding stability was also quantitatively assessed using free energy calculations. The mutations were categorized as potentially pathogenic or benign based on their structural impacts. Finally, the study introduces wild-type-SynGAP in complex with RasGTPase at the inner membrane, while considering the potential effects of mutations on these key interactions. 
This study provides a structural perspective on the clinical assessment of SynGAP missense variants and lays the foundation for future structure-based drug discovery.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11418247/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142280434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MultiSC: a deep learning pipeline for analyzing multiomics single-cell data.","authors":"Xiang Lin, Siqi Jiang, Le Gao, Zhi Wei, Junwen Wang","doi":"10.1093/bib/bbae492","DOIUrl":"https://doi.org/10.1093/bib/bbae492","url":null,"abstract":"<p><p>Single-cell technologies enable researchers to investigate cell functions at an individual cell level and study cellular processes with higher resolution. Several multi-omics single-cell sequencing techniques have been developed to explore various aspects of cellular behavior. Using NEAT-seq as an example, this method simultaneously obtains three kinds of omics data for each cell: gene expression, chromatin accessibility, and protein expression of transcription factors (TFs). Consequently, NEAT-seq offers a more comprehensive understanding of cellular activities in multiple modalities. However, there is a lack of tools available for effectively integrating the three types of omics data. To address this gap, we propose a novel pipeline called MultiSC for the analysis of MULTIomic Single-Cell data. Our pipeline leverages a multimodal constraint autoencoder (single-cell hierarchical constraint autoencoder) to integrate the multi-omics data during the clustering process and a matrix factorization-based model (scMF) to predict target genes regulated by a TF. Moreover, we utilize multivariate linear regression models to predict gene regulatory networks from the multi-omics data. Additional functionalities, including differential expression, mediation analysis, and causal inference, are also incorporated into the MultiSC pipeline. Extensive experiments were conducted to evaluate the performance of MultiSC. The results demonstrate that our pipeline enables researchers to gain a comprehensive view of cell activities and gene regulatory networks by fully leveraging the potential of multiomics single-cell data. 
By employing MultiSC, researchers can effectively integrate and analyze diverse omics data types, enhancing their understanding of cellular processes.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458747/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142388137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bioinformatics approaches for studying molecular sex differences in complex diseases.","authors":"Rebecca Ting Jiin Loo, Mohamed Soudy, Francesco Nasta, Mirco Macchi, Enrico Glaab","doi":"10.1093/bib/bbae499","DOIUrl":"https://doi.org/10.1093/bib/bbae499","url":null,"abstract":"<p><p>Many complex diseases exhibit pronounced sex differences that can affect both the initial risk of developing the disease, as well as clinical disease symptoms, molecular manifestations, disease progression, and the risk of developing comorbidities. Despite this, computational studies of molecular data for complex diseases often treat sex as a confounding variable, aiming to filter out sex-specific effects rather than attempting to interpret them. A more systematic, in-depth exploration of sex-specific disease mechanisms could significantly improve our understanding of pathological and protective processes with sex-dependent profiles. This survey discusses dedicated bioinformatics approaches for the study of molecular sex differences in complex diseases. It highlights that, beyond classical statistical methods, approaches are needed that integrate prior knowledge of relevant hormone signaling interactions, gene regulatory networks, and sex linkage of genes to provide a mechanistic interpretation of sex-dependent alterations in disease. The review examines and compares the advantages, pitfalls and limitations of various conventional statistical and systems-level mechanistic analyses for this purpose, including tailored pathway and network analysis techniques. 
Overall, this survey highlights the potential of specialized bioinformatics techniques to systematically investigate molecular sex differences in complex diseases, to inform biomarker signature modeling, and to guide more personalized treatment approaches.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11471957/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142458365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}