Chaorui Yan, Aoyun Geng, Zhuoyu Pan, Zilong Zhang, Feifei Cui
{"title":"MultiFeatVotPIP: a voting-based ensemble learning framework for predicting proinflammatory peptides.","authors":"Chaorui Yan, Aoyun Geng, Zhuoyu Pan, Zilong Zhang, Feifei Cui","doi":"10.1093/bib/bbae505","DOIUrl":"https://doi.org/10.1093/bib/bbae505","url":null,"abstract":"<p><p>Inflammatory responses may lead to tissue or organ damage, and proinflammatory peptides (PIPs) are signaling peptides that can induce such responses. Many diseases have been redefined as inflammatory diseases. To identify PIPs more efficiently, we expanded the dataset and designed an ensemble learning model with manually encoded features. Specifically, we adopted a more comprehensive feature encoding method and considered the actual impact of certain features to filter them. Identification and prediction of PIPs were performed using an ensemble learning model based on five different classifiers. The results show that the model's sensitivity, specificity, accuracy, and Matthews correlation coefficient are all higher than those of the state-of-the-art models. We named this model MultiFeatVotPIP, and both the model and the data can be accessed publicly at https://github.com/ChaoruiYan019/MultiFeatVotPIP. Additionally, we have developed a user-friendly web interface for users, which can be accessed at http://www.bioai-lab.com/MultiFeatVotPIP.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11479713/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142486031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marco Ruscone, Andrea Checcoli, Randy Heiland, Emmanuel Barillot, Paul Macklin, Laurence Calzone, Vincent Noël
{"title":"Building multiscale models with PhysiBoSS, an agent-based modeling tool.","authors":"Marco Ruscone, Andrea Checcoli, Randy Heiland, Emmanuel Barillot, Paul Macklin, Laurence Calzone, Vincent Noël","doi":"10.1093/bib/bbae509","DOIUrl":"https://doi.org/10.1093/bib/bbae509","url":null,"abstract":"<p><p>Multiscale models provide a unique tool for analyzing complex processes in which events occur at different scales across space and time. In the context of biological systems, such models can simulate mechanisms happening at the intracellular level, such as signaling, and at the extracellular level, where cells communicate and coordinate with other cells. These models aim to understand the impact of genetic or environmental deregulation observed in complex diseases, describe the interplay between a pathological tissue and the immune system, and suggest strategies to revert the diseased phenotypes. The construction of these multiscale models remains a very complex task, including the choice of the components to consider, the level of detail of the processes to simulate, and the fitting of the parameters to the data. One additional difficulty is the expert knowledge needed to program these models in languages such as C++ or Python, which may discourage the participation of non-experts. Simplifying this process through structured description formalisms, coupled with a graphical interface, is crucial in making modeling more accessible to the broader scientific community, as well as streamlining the process for advanced users. This article introduces three examples of multiscale models that rely on the PhysiBoSS framework, an add-on of PhysiCell that adds intracellular descriptions, as continuous-time Boolean models, to the agent-based approach. The article demonstrates how to construct these models more easily, relying on PhysiCell Studio, the PhysiCell Graphical User Interface. 
A step-by-step tutorial is provided as Supplementary Material and all models are provided at https://physiboss.github.io/tutorial/.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11489466/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142458367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yourui Han, Bolin Chen, Jun Bian, Ruiming Kang, Xuequn Shang
{"title":"Cancerous time estimation for interpreting the evolution of lung adenocarcinoma.","authors":"Yourui Han, Bolin Chen, Jun Bian, Ruiming Kang, Xuequn Shang","doi":"10.1093/bib/bbae520","DOIUrl":"https://doi.org/10.1093/bib/bbae520","url":null,"abstract":"<p><p>The evolution of lung adenocarcinoma is accompanied by a multitude of gene mutations and dysfunctions, rendering its phenotypic state and evolutionary direction highly complex. To interpret the evolution of lung adenocarcinoma, various methods have been developed to elucidate the molecular pathogenesis and functional evolution processes. However, most of these methods are constrained by the absence of cancerous temporal information, and the challenges of heterogeneous characteristics. To handle these problems, in this study, a patient quasi-potential landscape method was proposed to estimate the cancerous time of phenotypic states' emergence during the evolutionary process. Subsequently, a total of 39 different oncogenetic paths were identified based on cancerous time and mutations, reflecting the molecular pathogenesis of the evolutionary process of lung adenocarcinoma. To interpret the evolution patterns of lung adenocarcinoma, three oncogenetic graphs were obtained as the common evolutionary patterns by merging the oncogenetic paths. Moreover, patients were evenly re-divided into early, middle, and late evolutionary stages according to cancerous time, and a feasible framework was developed to construct the functional evolution network of lung adenocarcinoma. 
A total of six significant functional evolution processes were identified from the functional evolution network based on pathway enrichment analysis, and these play critical roles in understanding the development of lung adenocarcinoma.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11483137/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142458368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tianyi Chen, Xindian Wei, Lianxin Xie, Yunfei Zhang, Cheng Liu, Wenjun Shen, Si Wu, Hau-San Wong
{"title":"SELF-Former: multi-scale gene filtration transformer for single-cell spatial reconstruction.","authors":"Tianyi Chen, Xindian Wei, Lianxin Xie, Yunfei Zhang, Cheng Liu, Wenjun Shen, Si Wu, Hau-San Wong","doi":"10.1093/bib/bbae523","DOIUrl":"https://doi.org/10.1093/bib/bbae523","url":null,"abstract":"<p><p>The spatial reconstruction of single-cell RNA sequencing (scRNA-seq) data into spatial transcriptomics (ST) is a rapidly evolving field that addresses the significant challenge of aligning gene expression profiles to their spatial origins within tissues. This task is complicated by the inherent batch effects and the need for precise gene expression characterization to accurately reflect spatial information. To address these challenges, we developed SELF-Former, a transformer-based framework that utilizes multi-scale structures to learn gene representations, while designing spatial correlation constraints for the reconstruction of corresponding ST data. SELF-Former excels in recovering the spatial information of ST data and effectively mitigates batch effects between scRNA-seq and ST data. A novel aspect of SELF-Former is the introduction of a gene filtration module, which significantly enhances the spatial reconstruction task by selecting genes that are crucial for accurate spatial positioning and reconstruction. The superior performance and effectiveness of SELF-Former's modules have been validated across four benchmark datasets, establishing it as a robust and effective method for spatial reconstruction tasks. SELF-Former demonstrates its capability to extract meaningful gene expression information from scRNA-seq data and accurately map it to the spatial context of real ST data. 
Our method represents a significant advancement in the field, offering a reliable approach for spatial reconstruction.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11483138/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142458394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AntigenBoost: enhanced mRNA-based antigen expression through rational amino acid substitution.","authors":"Yumiao Gao, Siran Zhu, Huichun Li, Xueting Hao, Wen Chen, Deng Pan, Zhikang Qian","doi":"10.1093/bib/bbae468","DOIUrl":"https://doi.org/10.1093/bib/bbae468","url":null,"abstract":"<p><p>Messenger RNA (mRNA) vaccines represent a groundbreaking advancement in immunology and public health, particularly highlighted by their role in combating the COVID-19 pandemic. Optimizing mRNA-based antigen expression is a crucial focus in this emerging industry. We have developed a bioinformatics tool named AntigenBoost to address the challenge posed by destabilizing dipeptides that hinder ribosomal translation. AntigenBoost identifies these dipeptides within specific antigens and provides a range of potential amino acid substitution strategies using a two-dimensional scoring system. Through a combination of bioinformatics analysis and experimental validation, we significantly enhanced the in vitro expression of mRNA-derived Respiratory Syncytial Virus fusion glycoprotein and Influenza A Hemagglutinin antigen. Notably, a single amino acid substitution improved the immune response in mice, underscoring the effectiveness of AntigenBoost in mRNA vaccine design.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11472322/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142458362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constructing the dynamic transcriptional regulatory networks to identify phenotype-specific transcription regulators.","authors":"Yang Guo, Zhiqiang Xiao","doi":"10.1093/bib/bbae542","DOIUrl":"https://doi.org/10.1093/bib/bbae542","url":null,"abstract":"<p><p>The transcriptional regulatory network (TRN) is a graph framework that helps understand the complex transcriptional regulation mechanisms in the transcription process. Identifying the phenotype-specific transcription regulators is vital to reveal the functional roles of transcription elements in associating the specific phenotypes. Although many methods have been developed towards detecting the phenotype-specific transcription elements based on the static TRN in the past decade, most of them are not satisfactory for elucidating the phenotype-related functional roles of transcription regulators in multiple levels, as the dynamic characteristics of transcription regulators are usually ignored in static models. In this study, we introduce a novel framework called DTGN to identify the phenotype-specific transcription factors (TFs) and pathways by constructing dynamic TRNs. We first design a graph autoencoder model to integrate the phenotype-oriented time-series gene expression data and static TRN to learn the temporal representations of genes. Then, based on the learned temporal representations of genes, we develop a statistical method to construct a series of dynamic TRNs associated with the development of specific phenotypes. Finally, we identify the phenotype-specific TFs and pathways from the constructed dynamic TRNs. Results from multiple phenotypic datasets show that the proposed DTGN framework outperforms most existing methods in identifying phenotype-specific TFs and pathways. 
Our framework offers a new approach to exploring the functional roles of transcription regulators that associate with specific phenotypes in a dynamic model.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11503644/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142495337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kai Shi, Qiaohui Liu, Qingrong Ji, Qisheng He, Xing-Ming Zhao
{"title":"MicroHDF: predicting host phenotypes with metagenomic data using a deep forest-based framework.","authors":"Kai Shi, Qiaohui Liu, Qingrong Ji, Qisheng He, Xing-Ming Zhao","doi":"10.1093/bib/bbae530","DOIUrl":"https://doi.org/10.1093/bib/bbae530","url":null,"abstract":"<p><p>The gut microbiota plays a vital role in human health, and significant effort has been made to predict human phenotypes, especially diseases, using machine learning (ML) methods with the microbiota as a promising indicator or predictor. However, prediction accuracy is affected by many factors when predicting host phenotypes from metagenomic data, e.g. small sample size, class imbalance, and high-dimensional features. To address these challenges, we propose MicroHDF, an interpretable deep learning framework to predict host phenotypes, in which cascaded layers of deep forest units are designed to handle sample class imbalance and high-dimensional features. The experimental results show that the performance of MicroHDF is competitive with that of existing state-of-the-art methods on 13 publicly available datasets of six different diseases. In particular, it performs best, with an area under the receiver operating characteristic curve of 0.9182 ± 0.0098 and 0.9469 ± 0.0076 for inflammatory bowel disease (IBD) and liver cirrhosis, respectively. MicroHDF also shows better performance and robustness in cross-study validation. Furthermore, MicroHDF is applied to two high-risk diseases, IBD and autism spectrum disorder, as case studies to identify potential biomarkers. 
In conclusion, our method provides an effective and reliable prediction of the host phenotype and discovers informative features with biological insights.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11500453/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142516299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Miao Cui, Yadong Liu, Xian Yu, Hongzhe Guo, Tao Jiang, Yadong Wang, Bo Liu
{"title":"miniSNV: accurate and fast single nucleotide variant calling from nanopore sequencing data.","authors":"Miao Cui, Yadong Liu, Xian Yu, Hongzhe Guo, Tao Jiang, Yadong Wang, Bo Liu","doi":"10.1093/bib/bbae473","DOIUrl":"https://doi.org/10.1093/bib/bbae473","url":null,"abstract":"<p><p>Nanopore sequencing technology provides longer read lengths and can potentially address the limitations of short-read sequencing, including long-range haplotype phasing and accurate variant calling. However, state-of-the-art approaches still leave room for improvement in single nucleotide variant (SNV) identification performance and computing resource usage. In this work, we introduce miniSNV, a lightweight SNV calling algorithm that simultaneously achieves high performance and yield. miniSNV utilizes known common variants in populations as variation backgrounds and leverages read pileup, read-based phasing, and consensus generation to identify and genotype SNVs for Oxford Nanopore Technologies (ONT) long reads. Benchmarks on real and simulated ONT data under various error profiles demonstrate that miniSNV has superior sensitivity and comparable accuracy on SNV detection, and runs faster, with outstanding scalability and lower memory usage, than most state-of-the-art variant callers. 
miniSNV is available from https://github.com/CuiMiao-HIT/miniSNV.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11428505/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142341946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhiwei Rong, Jiali Song, Yipei Yu, Lan Mi, ManTang Qiu, Yuqin Song, Yan Hou
{"title":"Single-cell mosaic integration and cell state transfer with auto-scaling self-attention mechanism.","authors":"Zhiwei Rong, Jiali Song, Yipei Yu, Lan Mi, ManTang Qiu, Yuqin Song, Yan Hou","doi":"10.1093/bib/bbae540","DOIUrl":"https://doi.org/10.1093/bib/bbae540","url":null,"abstract":"<p><p>The integration of data from multiple modalities generated by single-cell omics technologies is crucial for accurately identifying cell states. One challenge in comprehending multi-omics data lies in mosaic integration, in which different data modalities are profiled in different subsets of cells, as it requires simultaneous batch effect removal and modality alignment. Here, we develop Multi-omics Mosaic Auto-scaling Attention Variational Inference (mmAAVI), a scalable deep generative model for single-cell mosaic integration. Leveraging auto-scaling self-attention mechanisms, mmAAVI can map arbitrary combinations of omics to a common embedding space. If well-annotated cell states exist, the model can perform semi-supervised learning to make use of these annotations. We validated the performance of mmAAVI and five other commonly used methods on four benchmark datasets, which vary in cell numbers, omics types, and missing patterns. mmAAVI consistently demonstrated its superiority. We also validated mmAAVI's ability for cell state knowledge transfer, achieving balanced accuracies of 0.82 and 0.97 with less than 1% labeled cells between batches with completely different omics. 
The full package is available at https://github.com/luyiyun/mmAAVI.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495875/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142495366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mingjun Ji, Qing Yu, Xin-Zhuang Yang, Xianhong Yu, Jiaxin Wang, Chunfu Xiao, Ni A An, Chuanhui Han, Chuan-Yun Li, Wanqiu Ding
{"title":"Long-range alternative splicing contributes to neoantigen specificity in glioblastoma.","authors":"Mingjun Ji, Qing Yu, Xin-Zhuang Yang, Xianhong Yu, Jiaxin Wang, Chunfu Xiao, Ni A An, Chuanhui Han, Chuan-Yun Li, Wanqiu Ding","doi":"10.1093/bib/bbae503","DOIUrl":"https://doi.org/10.1093/bib/bbae503","url":null,"abstract":"<p><p>Recent advances in neoantigen research have accelerated the development of immunotherapies for cancers, such as glioblastoma (GBM). Neoantigens resulting from genomic mutations and dysregulated alternative splicing have been studied in GBM. However, these studies have primarily focused on annotated alternatively-spliced transcripts, leaving non-annotated transcripts largely unexplored. Circular ribonucleic acids (circRNAs), abnormally regulated in tumors, are correlated with the presence of non-annotated linear transcripts with exon skipping events. But the extent to which these linear transcripts truly exist and their functions in cancer immunotherapies remain unknown. Here, we found the ubiquitous co-occurrence of circRNA biogenesis and alternative splicing across various tumor types, resulting in large amounts of long-range alternatively-spliced transcripts (LRs). By comparing tumor and healthy tissues, we identified tumor-specific LRs more abundant in GBM than in normal tissues and other tumor types. This may be attributable to the upregulation of the protein quaking in GBM, which is reported to promote circRNA biogenesis. In total, we identified 1057 specific and recurrent LRs in GBM. Through in silico translation prediction and MS-based immunopeptidome analysis, 16 major histocompatibility complex class I-associated peptides were identified as potential immunotherapy targets in GBM. 
This study revealed that long-range alternatively-spliced transcripts specifically upregulated in GBM may serve as recurrent, immunogenic tumor-specific antigens.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11472750/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142458375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}