{"title":"Author index Volume 22 (2024).","authors":"","doi":"10.1142/S0219720024990014","DOIUrl":"https://doi.org/10.1142/S0219720024990014","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 6","pages":"2499001"},"PeriodicalIF":0.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ASAP-DTA: Predicting drug-target binding affinity with adaptive structure aware networks.","authors":"Weibin Ding, Shaohua Jiang, Ting Xu, Zhijian Lyu","doi":"10.1142/S0219720024500288","DOIUrl":"10.1142/S0219720024500288","url":null,"abstract":"<p><p>The prediction of drug-target affinity (DTA) is crucial for efficiently identifying potential targets for drug repurposing, thereby reducing resource wastage. In this paper, we propose a novel graph-based deep learning model for DTA that leverages adaptive structure-aware pooling for graph processing. Our approach integrates a self-attention mechanism with an enhanced graph neural network to capture the significance of each node in the graph, marking a significant advancement in graph feature extraction. Specifically, adjacent nodes in the 2D molecular graph are aggregated into clusters, with the features of these clusters weighted according to their attention scores to form the final molecular representation. In terms of model architecture, we utilize both global and hierarchical pooling, and assess the performance of the model on multiple benchmark datasets. The evaluation results on the KIBA dataset show that our model achieved the lowest mean squared error (MSE) of 0.126, which is a 0.5% reduction compared to the best-performing baseline method. Additionally, to validate the generalization capabilities of the model, we conduct comparative experiments on regression and binary classification tasks. 
The results demonstrate that our model outperforms previous models in both types of tasks.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 6","pages":"2450028"},"PeriodicalIF":0.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on similarity retrieval method based on mass spectral entropy.","authors":"Li-Ping Wu, Li Yong, Xiang Cheng, Yang Zhou","doi":"10.1142/S0219720024500276","DOIUrl":"10.1142/S0219720024500276","url":null,"abstract":"<p><p>Compound identification in small molecule research relies on comparing experimental mass spectra with mass spectral databases. However, unequal data lengths often lead to inefficient and inaccurate retrieval. Moreover, the similarity calculation methods used by commercial software have limitations. To address these issues, two mass spectrometry data processing methods namely the \"splicing-filling method\" and the \"matching-filling method\" have been proposed. In addition, an information entropy-based similarity calculation method for mass spectra is presented. The alignment method converts mass spectra of different lengths for unknown and known compounds into equal-length mass spectra, allowing more accurate calculation of similarities between mass spectra. Information entropy measurements are used to quantify the differences in intensity distributions in the aligned mass spectral data, which are then used to compare the degree of similarity between different mass spectra. The results of the example validation show that the two data alignment methods can effectively solve the problem of unequal lengths of mass spectral data in similarity calculation. 
The results of the mass spectral entropy method are reliable and suitable for the identification of mass spectra.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 6","pages":"2450027"},"PeriodicalIF":0.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring relationship between hypercholesterolemia and instability of atherosclerotic plaque - An approach based on a matrix population model.","authors":"Mateusz Twardawa, Kaja Gutowska, Piotr Formanowicz","doi":"10.1142/S021972002450029X","DOIUrl":"10.1142/S021972002450029X","url":null,"abstract":"<p><p><b>Background:</b> Cardiovascular diseases have long been studied to identify their causal factors and counteract them effectively. Atherosclerosis, an inflammatory process of the blood vessel wall, is a common cardiovascular disease. Among the many well-known risk factors, hypercholesterolemia is undoubtedly a significant condition for atherosclerotic plaque formation and is linked to atherosclerosis on many levels, i.e. cell interactions, cytokine levels, diet, and lifestyle. Current studies suggest that controlling the balance between proinflammatory (<i>M</i>1) and anti-inflammatory (<i>M</i>2) types of macrophages may be used for patient condition improvement and necrotic core reduction. <b>Methods:</b> This study considered the effects of hypercholesterolemia on the population dynamics of macrophages (<i>M</i>0, <i>M</i>1, <i>M</i>2, foam cells) in atherosclerotic plaque. A mathematical model using a matrix approach to population dynamics was proposed and tested in various scenarios. To check model sensitivity and variability associated with error propagation, an uncertainty analysis was performed based on the Monte Carlo approach. <b>Results:</b> Simulations of macrophage population dynamics provided an assessment of necrotic core development and plaque instability. Excess lipid levels emerged as the most critical factor for necrotic core development. However, plaque growth can be significantly slowed if macrophages and foam cells can maintain proper lipid levels. This balance may be disrupted by proinflammatory lipids that will eventually increase plaque size, which is also reflected by <i>M</i>1/<i>M</i>2 dynamics. 
<b>Conclusion:</b> Hypercholesterolemia accelerates atherosclerosis development, leading to earlier cardiovascular incidents. <i>In silico</i> results suggest that reducing lipid intake and the portion of proinflammatory lipids is crucial to slowing plaque development and reducing rupture risk, all of which requires preserving the fragile <i>M</i>1/<i>M</i>2 balance. Targeting the inflammatory microenvironment and macrophage polarization represents a promising approach for atherosclerosis management.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 6","pages":"2450029"},"PeriodicalIF":0.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Molecular dynamics simulations of ribosome-binding sites in theophylline-responsive riboswitch associated with improving the gene expression regulation in chloroplasts.","authors":"Rahim Berahmand, Masoumeh Emadpour, Mokhtar Jalali Javaran, Kaveh Haji-Allahverdipoor, Ali Akbarabadi","doi":"10.1142/S0219720024500239","DOIUrl":"10.1142/S0219720024500239","url":null,"abstract":"<p><p>An efficient inducible transgene expression system is a valuable tool in recombinant protein production. The synthetic theophylline-responsive riboswitch (theo.RS) can be placed in the 5' untranslated region of an mRNA and control the translation of the downstream gene in chloroplasts in response to binding with a ligand molecule, theophylline. One of the drawbacks associated with the efficiency of the theo.RS is the leak in the RS structure allowing undesired background translation when the switch is expected to be off. The purpose of this study was to detect the factors causing the leak of the theo.RS in the off mode, using molecular dynamics (MD) simulations. After appropriate balancing of the simulation system using the necessary commands, a 40 ns simulation was conducted. Analysis of the solvent-accessible surface area for both ribosome-binding site (RBS) regions indicated that nucleotide 79 of the theo.RS, a guanine, had the highest surface exposure to ribosome access. These results were verified with the study of hydrogen bonding of RBS regions with the RNA structure. 
Therefore, redesigning the RBS regions and avoiding the unmasked nucleotide(s) in the structure may improve the tightness of theo.RS in off mode resulting in the efficient inhibition of translation.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 5","pages":"2450023"},"PeriodicalIF":0.9,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142688505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SAKit: An all-in-one analysis pipeline for identifying novel proteins resulting from variant events at both large and small scales.","authors":"Yan Li, Boran Wang, Zengding Wu, Shiliang Ji, Shi Xu, Caiyi Fei","doi":"10.1142/S0219720024500227","DOIUrl":"10.1142/S0219720024500227","url":null,"abstract":"<p><p><i>Background:</i> Genetic mutations that cause the inactivation or aberrant activation of essential proteins may trigger alterations or even dysfunctions in cellular signaling pathways, culminating in the development of precancerous lesions and cancer. Mutations and such dysfunctions can result in the generation of \"novel proteins\" that are not part of the conventional human proteome. Identification of these proteins carries a profound potential for unraveling promising drug targets and designing innovative therapeutic models. Despite the emergence of diverse tools for detecting DNA or RNA variants, facilitated by the widespread adoption of nucleotide sequencing technology, these methods primarily target point mutations and exhibit suboptimal performance in detecting large-scale and combinatorial mutations. Additionally, the outcomes of these tools are confined to the genome and transcriptome levels, and do not provide the corresponding protein information resulting from genetic alterations. <i>Results:</i> We present the development of Sequencing Analysis Kit (SAKit), a bioinformatics pipeline for hybrid sequencing analysis integrating long-read and short-read RNA sequencing data. Long reads are utilized for detecting large-scale variations such as gene fusions, exon skipping, intron retention, and aberrant expression in non-coding regions, owing to their excellent coverage capabilities. Short reads serve to validate these findings at breakpoints and splice junctions. 
Conversely, short reads are employed for identifying small-scale variations, including single nucleotide variants, deletions, and insertions, due to their superior sequencing depth, with long reads providing additional validation. SAKit is designed to perform analyses using inter-species configuration files comprising genome references and annotation data, making it applicable to both human and mouse studies. Furthermore, SAKit implements a hierarchical filtering approach to eliminate low-confidence variants and employs open reading frame (ORF) analysis to translate identified variants into protein sequences. <i>Conclusion:</i> SAKit is a robust and versatile bioinformatics tool designed for the comprehensive identification of both large-scale and small-scale variants from RNA-seq data, facilitating the discovery of novel proteins. This pipeline integrates analysis of long-read and short-read sequencing data, offering a powerful solution for researchers in genomics and transcriptomics. SAKit is freely accessible and open-source, available through GitHub (https://github.com/therarna/SAKit) and as a Docker image (https://hub.docker.com/repository/docker/therarna). Implemented primarily within a Snakemake framework using Python, SAKit ensures reproducibility, scalability, and ease of use for the scientific community.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 5","pages":"2450022"},"PeriodicalIF":0.9,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142688766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving drug-target interaction prediction through dual-modality fusion with InteractNet.","authors":"Baozhong Zhu, Runhua Zhang, Tengsheng Jiang, Zhiming Cui, Jing Chen, Hongjie Wu","doi":"10.1142/S0219720024500240","DOIUrl":"10.1142/S0219720024500240","url":null,"abstract":"<p><p>In the drug discovery process, accurate prediction of drug-target interactions is crucial to accelerate the development of new drugs. However, existing methods still face many challenges in dealing with complex biomolecular interactions. To this end, we propose a new deep learning framework that combines the structural information and sequence features of proteins to provide comprehensive feature representation through bimodal fusion. This framework not only integrates the topological adaptive graph convolutional network and multi-head attention mechanism, but also introduces a self-masked attention mechanism to ensure that each protein binding site can focus on its own unique features and its interaction with the ligand. Experimental results on multiple public datasets show that our method significantly outperforms traditional machine learning and graph neural network methods in predictive performance. 
In addition, our method can effectively identify and explain key molecular interactions, providing new insights into understanding the complex relationship between drugs and targets.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 5","pages":"2450024"},"PeriodicalIF":0.9,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142689319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of a multi-tissue compound-target interaction network of Qingfei Paidu decoction in COVID-19 treatment based on deep learning and transcriptomic analysis.","authors":"Xia Li, Xuetong Zhao, Xinjian Yu, Jianping Zhao, Xiangdong Fang","doi":"10.1142/S0219720024500161","DOIUrl":"10.1142/S0219720024500161","url":null,"abstract":"<p><p>The Qingfei Paidu decoction (QFPDD) is a widely acclaimed therapeutic formula employed nationwide for the clinical management of coronavirus disease 2019 (COVID-19). QFPDD exerts a synergistic therapeutic effect, characterized by its multi-component, multi-target, and multi-pathway action. However, the intricate interactions among the ingredients and targets within QFPDD and their systematic effects in multiple tissues remain undetermined. To address this, we qualitatively characterized the chemical components of QFPDD. We integrated multi-tissue transcriptomic analysis with GraphDTA, a deep learning model, to screen for potential compound-target interactions of QFPDD in multiple tissues. We predicted 13 key active compounds, 127 potential targets and 27 pathways associated with QFPDD across six different tissues. Notably, oleanolic acid-AXL exhibited leading affinity in the heart, blood, and liver. Molecular docking and molecular dynamics simulation confirmed their strong binding affinity. The robust interaction between oleanolic acid and the AXL receptor suggests that AXL is a promising target for developing clinical intervention strategies. Through the construction of a multi-tissue compound-target interaction network, our study further elucidated the mechanisms through which QFPDD effectively combats COVID-19 in multiple tissues. 
Our work also establishes a framework for future investigations into the systemic effects of other Traditional Chinese Medicine (TCM) formulas in disease treatment.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450016"},"PeriodicalIF":0.9,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141735373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PCA-constrained multi-core matrix fusion network: A novel approach for cancer subtype identification.","authors":"Min Li, Zhifang Qi, Liang Liu, Mingzhu Lou, Shaobo Deng","doi":"10.1142/S0219720024500148","DOIUrl":"10.1142/S0219720024500148","url":null,"abstract":"<p><p>Cancer subtyping refers to categorizing a particular cancer type into distinct subtypes or subgroups based on a range of molecular characteristics, clinical manifestations, histological features, and other relevant factors. The identification of cancer subtypes can significantly enhance precision in clinical practice and facilitate personalized diagnosis and treatment strategies. Recent advancements in the field have witnessed the emergence of numerous network fusion methods aimed at identifying cancer subtypes. The majority of these fusion algorithms, however, solely rely on the fusion network of a single core matrix for the identification of cancer subtypes and fail to comprehensively capture similarity. To tackle this issue, in this study, we propose a novel cancer subtype recognition method, referred to as PCA-constrained multi-core matrix fusion network (PCA-MM-FN). The PCA-MM-FN algorithm initially employs three distinct methods to obtain three core matrices. Subsequently, the obtained core matrices are projected into a shared subspace using principal component analysis, followed by a weighted network fusion. Lastly, spectral clustering is conducted on the fused network. 
The results obtained from conducting experiments on the mRNA expression, DNA methylation, and miRNA expression of five TCGA datasets and three multi-omics benchmark datasets demonstrate that the proposed PCA-MM-FN approach exhibits superior accuracy in identifying cancer subtypes compared to the existing methods.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450014"},"PeriodicalIF":0.9,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142057039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integration of autoencoder and graph convolutional network for predicting breast cancer drug response.","authors":"V Abinas, U Abhinav, E M Haneem, A Vishnusankar, K A Abdul Nazeer","doi":"10.1142/S0219720024500136","DOIUrl":"10.1142/S0219720024500136","url":null,"abstract":"<p><p><b>Background and objectives:</b> Breast cancer is the most prevalent type of cancer among women. The effectiveness of anticancer pharmacological therapy may be adversely affected by tumor heterogeneity that includes genetic and transcriptomic features. This leads to clinical variability in patient response to therapeutic drugs. Anticancer drug design and cancer understanding require precise identification of cancer drug responses. The performance of drug response prediction models can be improved by integrating multi-omics data and drug structure data. <b>Methods:</b> In this paper, we propose an Autoencoder (AE) and Graph Convolutional Network (AGCN) for drug response prediction, which integrates multi-omics data and drug structure data. Specifically, we first converted the high dimensional representation of each omic data to a lower dimensional representation using an AE for each omic data set. Subsequently, these individual features are combined with drug structure data obtained using a Graph Convolutional Network and given to a Convolutional Neural Network to calculate IC<sub>50</sub> values for every combination of cell lines and drugs. Then a threshold IC<sub>50</sub> value is obtained for each drug by performing K-means clustering of their known IC<sub>50</sub> values. Finally, with the help of this threshold value, cell lines are classified as either sensitive or resistant to each drug. <b>Results:</b> Experimental results indicate that AGCN has an accuracy of 0.82 and performs better than many existing methods. 
In addition to that, we have done external validation of AGCN using data taken from The Cancer Genome Atlas (TCGA) clinical database, and we got an accuracy of 0.91. <b>Conclusion:</b> According to the results obtained, concatenating multi-omics data with drug structure data using AGCN for drug response prediction tasks greatly improves the accuracy of the prediction task.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 3","pages":"2450013"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141761970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}