Journal of Bioinformatics and Computational Biology: Latest Articles, Page 2

Prediction and annotation of alternative transcription starts and promoter shift in the chicken genome.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2025-04-01 Epub Date: 2025-06-04 DOI: 10.1142/S0219720025500040

Valentina A Grushina, Ivan S Yevshin, Oleg A Gusev, Fedor A Kolpakov, Olga I Stanishevskaya, Elena S Fedorova, Natalia A Zinovieva, Sergey S Pintus

Abstract: Promoter shifting, characterized by alterations in Transcription Start Site (TSS) coordinates, is a well-documented phenomenon. The impact and statistical significance of promoter shifting can be assessed through analysis of Cap Analysis of Gene Expression (CAGE) data. This phenomenon is associated with developmental stage transitions, tissue differentiation, and cellular responses to environmental stimuli. Differential promoter usage suggests nonconstitutive expression of the regulated gene, indicative of focused promoter utilization. Conversely, housekeeping genes are typically characterized by stable expression levels driven by multiple dispersed promoters and are commonly expressed across a wide range of tissues. However, our findings demonstrate that many ubiquitously expressed genes utilize single, focused promoters and undergo significant promoter shifting, adding a layer of complexity to the definition of a housekeeping gene. Differential gene expression is commonly used to study gene responses to external stimuli in cells and tissues. Here, we employ an alternative approach based on differential promoter usage, identifying genes exhibiting significant promoter shifting as signatures of tissue response and phenotypic effects. Our results suggest that variations in chicken growth rate are regulated by nutrient metabolism rates, mediated through differential promoter usage of relevant genes.

Volume 23, Issue 2, Article 2550004.

Citations: 0

Analysis of clonal evolution in cancer: A computational perspective.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2025-04-01 Epub Date: 2025-06-04 DOI: 10.1142/S0219720025310018

Paulo Henrique Ribeiro, Adenilso Simao

Citations: 0

M³-20M: A large-scale multi-modal molecule dataset for AI-driven drug design and discovery.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2025-04-01 Epub Date: 2025-06-04 DOI: 10.1142/S0219720025500064

Siyuan Guo, Lexuan Wang, Chang Jin, Jinxian Wang, Han Peng, Huayang Shi, Wengen Li, Jihong Guan, Shuigeng Zhou

Abstract: This paper introduces M3-20M, a large-scale Multi-Modal Molecule dataset that contains over 20 million molecules, with the data mainly being integrated from existing databases and partially generated by large language models. Designed to support AI-driven drug design and discovery, M3-20M is 71 times more in the number of molecules than the largest existing dataset, providing an unprecedented scale that can highly benefit the training or fine-tuning of models, including large language models for drug design and discovery tasks. This dataset integrates one-dimensional SMILES, two-dimensional molecular graphs, three-dimensional molecular structures, physicochemical properties, and textual descriptions collected through web crawling and generated using GPT-3.5, offering a comprehensive view of each molecule. To demonstrate the power of M3-20M in drug design and discovery, we conduct extensive experiments on two key tasks: molecule generation and molecular property prediction, using large language models including GLM4, GPT-3.5, GPT-4, and Llama3-8b. Our experimental results show that M3-20M can significantly boost model performance in both tasks. Specifically, it enables the models to generate more diverse and valid molecular structures and achieve higher property prediction accuracy than existing single-modal datasets, which validates the value and potential of M3-20M in supporting AI-driven drug design and discovery. The dataset is available at https://github.com/bz99bz/M-3.

Volume 23, Issue 2, Article 2550006.

Citations: 0

Computation and analysis of stationary and periodic solutions of the COVID-19 infection dynamics model.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2025-02-01 DOI: 10.1142/S0219720025400013

Michael Khristichenko, Yuri Nechepurenko, Dmitry Grebennikov, Gennady Bocharov

Abstract: In this work, we search for the conditions for the occurrence of long COVID using the recently developed COVID-19 disease dynamics model which is a system of delay differential equations. To do so, we search for stable stationary or periodic solutions of this model with low viral load that can be interpreted as long COVID using our recently developed technology for analysing time-delay systems. The results of the bifurcation and sensitivity analysis of the mathematical model of SARS-CoV-2 infection suggest the following biological conclusions concerning the mechanisms of pathogenesis of long COVID-19. First, the possibility of SARS-CoV-2 persistence requires a 3-time reduction of the virus production rate per infected cell, or 18-times increase of the antibody-mediated elimination rate of free viruses as compared to an acute infection baseline estimates. Second, the loss of kinetic coordination between virus-induced type I IFN, antibody, and cytotoxic T lymphocyte (CTL) responses can result in the development of mild severity long-lasting infection. Third, the low-level persistent SARS-CoV-2 infection is robust to up to 100-fold perturbations (increase) in viral load and most sensitive to parameters of the humoral immune response.

Volume 23, Issue 1, Article 2540001.

Citations: 0

Cross-cellular analysis of chromatin accessibility markers H3K4me3 and DNase in the context of detecting cell-identity genes: An "all-or-nothing" approach.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2025-02-01 DOI: 10.1142/S0219720025400025

Boon How Low, Kaushal Krishna Kaliskar, Stefano Perna, Bernett Lee

Abstract: Cell identity is often associated to a subset of highly-expressed genes that define the cell processes, as opposed to essential genes that are always active. Cell-specific genes may be defined in opposition to essential genes, or via experimental means. Detection of said cell-specific genes is often a primary goal in the study of novel biosamples. Chromatin accessibility markers (such as DNase and H3K4me3) help identify actively transcribed genes, but data can be difficult to come by for entirely novel biosamples. In this study, we investigate the possibility of associating the cell-specificity status of genes with chromatin accessibility markers from different cell lines, and we suggest that the number of cell lines in which a gene is found to be marked by DNase/H3K4me3 is predictive of the essentiality status itself. We define a measure called the Cross-cellular Chromatin Openness (CCO) level, and show that it is associated with the essentiality status using two differentiation experiments. We then compare the CCO-level predictive power to existing scRNA-Seq and bulk RNA-Seq methods, showing it has good concordance when applicable.

Volume 23, Issue 1, Article 2540002.

Citations: 0

SS-DTI: A deep learning method integrating semantic and structural information for drug-target interaction prediction.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2025-02-01 Epub Date: 2025-03-25 DOI: 10.1142/S0219720025500027

Yujie Chun, Huaihu Li, Shunfang Wang

Citations: 0

Drug repurposing for non-small cell lung cancer by predicting drug response using pathway-level graph convolutional network.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2025-02-01 Epub Date: 2025-03-25 DOI: 10.1142/S0219720025500015

I T Anjusha, K A Abdul Nazeer, N Saleena

Abstract: Drug repurposing is the process of identifying new clinical indications for an existing drug. Some of the recent studies utilized drug response prediction models to identify drugs that can be repurposed. By representing cell-line features as a pathway-pathway interaction network, we can better understand the connections between cellular processes and drug response mechanisms. Existing deep learning models for drug response prediction do not integrate known biological pathway-pathway interactions into the model. This paper presents a drug response prediction model that applies a graph convolution operation on a pathway-pathway interaction network to represent features of cancer cell-lines effectively. The model is used to identify potential drug repurposing candidates for Non-Small Cell Lung Cancer (NSCLC). Experiment results show that the inclusion of graph convolutional model applied on a pathway-pathway interaction network makes the proposed model more effective in predicting drug response than the state-of-the-art methods. Specifically, the model has shown better performance in terms of Root Mean Squared Error, Coefficient of Determination, and Pearson's Correlation Coefficient when applied to the GDSC1000 dataset. Also, most of the drugs that the model predicted as top candidates for NSCLC treatment are either undergoing clinical studies or have some evidence in the PubMed literature database.

Article 2550001.

Citations: 0

DDINet: Drug-drug interaction prediction network based on multi-molecular fingerprint features and multi-head attention centered weighted autoencoder.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2025-02-01 DOI: 10.1142/S0219720025500039

K Soni Sharmila, Thanga Revathi S, Pokkuluri Kiran Sree

Abstract: Drug-drug interactions (DDIs) pose a major concern in polypharmacy due to their potential to cause unexpected side effects that can adversely affect a patient's health. Therefore, it is crucial to identify DDIs effectively during the early stages of drug discovery and development. In this paper, a novel DDI prediction network (DDINet) is proposed to enhance the predictive performance over conventional DDI methods. Leveraging the DrugBank dataset, drugs are represented using the Simplified Molecular Input Line-Entry System (SMILES), with the RDKit software pre-processing the SMILES strings into their canonical forms. Multiple molecular fingerprinting techniques such as Extended Connectivity Fingerprints (ECFPs), Molecular ACCess System keys (MACCSkeys), PubChem Fingerprints, 3D molecular fingerprints (3D-FP), and molecular dynamics fingerprints (MDFPs) are employed to encode drug chemical structures into feature vectors. Drug similarities are computed using the Tanimoto coefficient (TC), and the final Structural Similarity Profile (SSP) is obtained by averaging the five molecular fingerprint types. The novelty of the approach lies in the integration of a Multi-head Attention centered Weighted Autoencoder (Mul_WAE) as the interaction prediction module, which leverages the Multi-head Attention (MHA) layer to focus on the most significant input features. Furthermore, we introduce the Upgraded Bald Eagle Search Optimization (UBesO) algorithm, which optimally selects the learnable parameters of the Mul_WAE based on cross-entropy loss, improving the model's convergence and performance. The proposed DDINet model achieves an accuracy of 99.77%, 99.66% of AUC, 99.5% average precision, 99.4% precision, and 99.49% recall, providing a comprehensive evaluation of the model's robustness. Beyond high accuracy, DDINet offers advantages in scalability, making it well suited for handling large datasets due to its efficient feature extraction and optimization processes. The unique combination of multiple molecular fingerprinting methods with the MHA layer and UBesO algorithm highlights the innovative aspects of our model and significantly improves prediction performance compared to existing approaches.

Volume 23, Issue 1, Article 2550003.

Citations: 0

Gene regulatory network inference based on modified adaptive lasso.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2024-12-01 Epub Date: 2025-01-21 DOI: 10.1142/S0219720024500264

Chao Li, Xiaoran Huang, Xiao Luo, Xiaohui Lin

Abstract: Gene regulatory networks (GRNs) reveal the regulatory interactions among genes and provide a visual tool to explain biological processes. However, how to identify direct relations among genes from gene expression data in the case of high-dimensional and small samples is a critical challenge. In this paper, we proposed a new GRN inference method based on a modified adaptive least absolute shrinkage and selection operator (MALasso). MALasso expands the number of samples based on the distance correlation and defines a new weighting manner for adaptive lasso to remove false positive edges of the networks in the iterative process. Simulated data and gene expression data from DREAM challenge were used to validate the performance of the proposed method MALasso. The comparison results among MALasso, adaptive lasso and other six state-of-the-art methods show that MALasso outperformed the competition methods in AUROCC and AUPRC in most cases and had a better ability to distinguish direct edges from indirect ones. Hence, by modifying the adaptive weighting manner of adaptive lasso, MALasso can detect linear and nonlinear relations, remove the false positive edges and identify direct relations among genes more accurately.

Article 2450026.

Citations: 0

The use of 4D data-independent acquisition-based proteomic analysis and machine learning to reveal potential biomarkers for stress levels.

IF 0.9, Zone 4 (Biology)

Journal of Bioinformatics and Computational Biology Pub Date : 2024-12-01 Epub Date: 2024-11-15 DOI: 10.1142/S0219720024500252

Dehua Chen, Yongsheng Yang, Dongdong Shi, Zhenhua Zhang, Mei Wang, Qiao Pan, Jianwen Su, Zhen Wang

Abstract: Research suggests that individuals who experience prolonged exposure to stress may be at higher risk for developing psychological stress disorders. Currently, psychological stress is primarily evaluated by professional physicians using rating scales, which may be prone to subjective biases and limitations of the scales. Therefore, it is imperative to explore more objective, accurate, and efficient biomarkers for evaluating the level of psychological stress in an individual. In this study, we utilized 4D data-independent acquisition (4D-DIA) proteomics for quantitative protein analysis, and then employed support vector machine (SVM) combined with SHAP interpretation algorithm to identify potential biomarkers for psychological stress levels. Biomarkers validation was subsequently achieved through machine learning classification and a substantial amount of a priori knowledge derived from the knowledge graph. We performed cross-validation of the biomarkers using two batches of data, and the results showed that the combination of Glyceraldehyde-3-phosphate dehydrogenase and Fibronectin yielded an average area under the curve (AUC) of 92%, an average accuracy of 86%, an average F1 score of 79%, and an average sensitivity of 83%. Therefore, this combination may represent a potential approach for detecting stress levels to prevent psychological stress disorders.

Article 2450025.

Citations: 0