IEEE/ACM Transactions on Computational Biology and Bioinformatics: Latest Articles, Page 6

Relation Extraction in Biomedical Texts: A Cross-Sentence Approach

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-09-06 DOI: 10.1109/TCBB.2024.3451348

Zhijing Li;Liwei Tian;Yiping Jiang;Yucheng Huang

Abstract: Relation extraction, a crucial task in understanding the intricate relationships between entities in biomedical domains, has predominantly focused on binary relations within single sentences. However, in practical biomedical scenarios, relationships often extend across multiple sentences, leading to extraction errors with potential impacts on clinical decision-making and medical diagnosis. To overcome this limitation, we present a novel cross-sentence relation extraction framework that integrates and enhances coreference resolution and relation extraction models. Coreference resolution serves as the foundation, breaking sentence boundaries and linking entities across sentences. Our framework incorporates pre-trained deep language representations and leverages graph LSTMs to effectively model cross-sentence entity mentions. The use of a self-attentive Transformer architecture and external semantic information further enhances the modeling of intricate relationships. Comprehensive experiments conducted on two standard datasets, namely the BioNLP dataset and THYME dataset, demonstrate the state-of-the-art performance of our proposed approach.

Vol. 21, No. 6, pp. 2156-2166. Journal Article.

Citations: 0

CTsynther: Contrastive Transformer Model for End-to-End Retrosynthesis Prediction

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-09-06 DOI: 10.1109/TCBB.2024.3455381

Hao Lu;Zhiqiang Wei;Kun Zhang;Xuze Wang;Liaqat Ali;Hao Liu

Abstract: Retrosynthesis prediction is a fundamental problem in organic chemistry and drug synthesis. We proposed an end-to-end deep learning model called CTsynther (Contrastive Transformer for single-step retrosynthesis prediction model) that could provide single-step retrosynthesis prediction without external reaction templates or specialized knowledge. The model introduced the concept of contrastive learning in the Transformer architecture and employed a contrastive learning language representation model at the SMILES sentence level to enhance model inference by learning similarities and differences between various samples. Mixed global and local attention mechanisms allow the model to capture features and dependencies between different atoms to improve generalization. We further investigated the embedding representations of SMILES learned automatically from the model. Visualization results show that the model could effectively acquire information about identical molecules and improve prediction performance. Experiments showed that the accuracy of retrosynthesis reached 53.5% and 64.4% with and without reaction types, respectively. The validity of the predicted reactants is improved, showing competitiveness compared with semi-template methods.

Vol. 21, No. 6, pp. 2235-2245. Journal Article.
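The abstract above says contrastive learning is applied to SMILES sentence representations but does not state the loss. As a rough illustration only, the sketch below implements a generic InfoNCE-style contrastive loss over paired embeddings. The function names, the temperature value, and the choice of InfoNCE are assumptions for illustration, not the authors' formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (hypothetical helper, not from the paper).

    anchors[i] and positives[i] are two views of the same sample;
    positives[j] with j != i act as in-batch negatives for anchors[i].
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings: matched pairs should give a much smaller loss than mismatched ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss falls as each anchor moves closer to its own positive than to the other samples in the batch, which is the "similarities and differences between various samples" idea in general terms.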

Citations: 0

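Integrating Similarities via Local Interaction Consistency and Optimizing Area Under the Curve Measures via Matrix Factorization for Drug-Target Interaction Prediction

IF 3.6, Zone 3 (Biology)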

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-09-03 DOI: 10.1109/TCBB.2024.3453499

Bin Liu;Grigorios Tsoumakas

Abstract: In drug discovery, identifying drug-target interactions (DTIs) via experimental approaches is a tedious and expensive procedure. Computational methods efficiently predict DTIs and recommend a small part of potential interacting pairs for further experimental confirmation, accelerating the drug discovery process. Although fusing heterogeneous drug and target similarities can improve the prediction ability, the existing similarity combination methods ignore the interaction consistency for neighbour entities. Furthermore, area under the precision-recall curve (AUPR) and area under the receiver operating characteristic curve (AUC) are two widely used evaluation metrics in DTI prediction. However, the two metrics are seldom considered as losses within existing DTI prediction methods. We propose a local interaction consistency (LIC) aware similarity integration method to fuse vital information from diverse views for DTI prediction models. Furthermore, we propose two matrix factorization (MF) methods that optimize AUPR and AUC using convex surrogate losses respectively, and then develop an ensemble MF approach that takes advantage of the two area under the curve metrics by combining the two single metric based MF models. Experimental results under different prediction settings show that the proposed methods outperform various competitors in terms of the metric(s) they optimize and are reliable in discovering potential new DTIs.

Vol. 21, No. 6, pp. 2212-2225. Journal Article.
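To make the idea of optimizing AUC through a convex surrogate concrete, the sketch below computes AUC as the fraction of correctly ranked positive-negative pairs, then a pairwise logistic loss that upper-bounds 1 - AUC. This is a generic textbook construction under assumed names (`pairwise_auc`, `pairwise_logistic_loss`). The abstract does not specify which surrogate the authors use, so it should not be read as their method.

```python
import math


def _split(scores, labels):
    """Separate scores of positives (label 1) and negatives (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def pairwise_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos, neg = _split(scores, labels)
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos) * len(neg))


def pairwise_logistic_loss(scores, labels):
    """Mean of log2(1 + exp(-(s_pos - s_neg))) over all pairs.

    Each term is convex in the score difference and is at least 1 whenever
    the pair is misordered or tied, so the mean upper-bounds 1 - AUC.
    """
    pos, neg = _split(scores, labels)
    total = 0.0
    for p in pos:
        for n in neg:
            total += math.log1p(math.exp(-(p - n))) / math.log(2.0)
    return total / (len(pos) * len(neg))


# Toy example: three of the four positive-negative pairs are ordered correctly.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
auc = pairwise_auc(scores, labels)
loss = pairwise_logistic_loss(scores, labels)
```

Because the surrogate is smooth and convex in the scores, it can be minimized with gradient methods, whereas AUC itself is piecewise constant.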

Citations: 0

LKLPDA: A Low-Rank Fast Kernel Learning Approach for Predicting piRNA-Disease Associations

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-08-30 DOI: 10.1109/TCBB.2024.3452055

Qingzhou Shi;Kai Zheng;Haoyuan Li;Bo Wang;Xiao Liang;Xinyu Li;Jianxin Wang

Abstract: Piwi-interacting RNAs (piRNAs) are increasingly recognized as potential biomarkers for various diseases. Investigating the complex relationship between piRNAs and diseases through computational methods can reduce the costs and risks associated with biological experiments. Fast kernel learning (FKL) is a classical method for multi-source data fusion that is widely employed in association prediction research. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper the effectiveness of the network-based ideal kernel. The conventional FKL method does not address this issue. In this study, we propose a low-rank fast kernel learning (LRFKL) algorithm, which consists of low-rank representation (LRR) and the FKL algorithm. The LRFKL algorithm is designed to mitigate the effects of noise on the network-based ideal kernel. Using LRFKL, we propose a novel approach for predicting piRNA-disease associations called LKLPDA. Specifically, we first compute the similarity matrices for piRNAs and diseases. Then we use LRFKL to fuse the similarity matrices for piRNAs and diseases separately. Finally, LKLPDA employs AutoGluon-Tabular for predictive analysis. Computational results show that LKLPDA effectively predicts piRNA-disease associations with higher accuracy compared to previous methods. In addition, case studies confirm the reliability of the model in predicting piRNA-disease associations.

Vol. 21, No. 6, pp. 2179-2187. Journal Article.

Citations: 0

MMD-DTA: A Multi-Modal Deep Learning Framework for Drug-Target Binding Affinity and Binding Region Prediction

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-08-29 DOI: 10.1109/TCBB.2024.3451985

Qi Zhang;Yuxiao Wei;Bo Liao;Liwei Liu;Shengli Zhang

Abstract: The prediction of drug-target affinity (DTA) plays a crucial role in drug development and the identification of potential drug targets. In recent years, computer-assisted DTA prediction has emerged as a significant approach in this field. In this study, we propose a multi-modal deep learning framework called MMD-DTA for predicting drug-target binding affinity and binding regions. The model can predict DTA while simultaneously learning the binding regions of drug-target interactions through unsupervised learning. To achieve this, MMD-DTA first uses graph neural networks and a target structural feature extraction network to extract multi-modal information from the sequences and structures of drugs and targets. It then utilizes the feature interaction and fusion modules to generate interaction descriptors for predicting DTA and interaction strength for binding region prediction. Our experimental results demonstrate that MMD-DTA outperforms existing models based on key evaluation metrics. Furthermore, external validation results indicate that MMD-DTA enhances the generalization capability of the model by integrating sequence and structural information of drugs and targets. The model trained on the benchmark dataset can effectively generalize to independent virtual screening tasks. The visualization of drug-target binding region prediction showcases the interpretability of MMD-DTA, providing valuable insights into the functional regions of drug molecules that interact with proteins.

Vol. 21, No. 6, pp. 2200-2211. Journal Article.

Citations: 0

Contrasting Multi-Source Temporal Knowledge Graphs for Biomedical Hypothesis Generation

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-08-28 DOI: 10.1109/TCBB.2024.3451051

Huiwei Zhou;Wenchu Li;Weihong Yao;Yingyu Lin;Lei Du

Abstract: Hypothesis Generation (HG) aims to expedite biomedical research by generating novel hypotheses from existing scientific literature. Most existing studies focused on modeling static snapshots of the corpus, neglecting the temporal evolution of scientific terms. Despite recent efforts to learn term evolution from Knowledge Bases (KBs) for HG, the temporal information from multi-source KBs, which contains important, up-to-date knowledge, is still overlooked. In this paper, an innovative Temporal Contrastive Learning (TCL) framework is introduced to uncover latent associations between entities by jointly modeling their co-evolution across multi-source temporal KBs. Specifically, we first construct a temporal relation graph based on PubMed papers and a biomedical relation database (such as the Comparative Toxicogenomics Database (CTD)). Then the constructed temporal relation graph and a temporal concept graph (such as Medical Subject Headings (MeSH)) are used to train two GCN-based recurrent networks for learning the entity temporal evolutional embeddings, respectively. Finally, a cross-view temporal prediction task is designed for learning knowledge-enriched temporal embeddings by contrasting the temporal embeddings learned from the two Temporal Knowledge Graphs (TKGs). Findings from experiments conducted on three real-world biomedical term relationship datasets demonstrate that the proposed approach is clearly superior to approaches based on a single TKG, achieving state-of-the-art performance.

Vol. 21, No. 6, pp. 2102-2112. Journal Article.

Citations: 0

Compact Class-Conditional Attribute Category Clustering: Amino Acid Grouping for Enhanced HIV-1 Protease Cleavage Classification

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-08-23 DOI: 10.1109/TCBB.2024.3448617

José A. Sáez;J. Fernando Vera

Abstract: Categorical attributes are common in many classification tasks, presenting certain challenges as the number of categories grows. This situation can affect data handling, negatively impacting the building time of models, their complexity and, ultimately, their classification performance. In order to mitigate these issues, this research proposes a novel preprocessing technique for grouping attribute categories in classification datasets. This approach combines the exact representation of the association between categorical values in a Euclidean space, clustering methods and attribute quality metrics to group similar attribute categories based on their contribution to the classification task. To estimate its effectiveness, the proposal is evaluated within the context of HIV-1 protease cleavage site prediction, where each attribute represents an amino acid that can take multiple possible values. The results obtained on HIV-1 real-world datasets show a significant reduction in the number of categories per attribute, with an average reduction percentage ranging from 74% to 81%. This reduction leads to simplified data representations and improved classification performances compared to not preprocessing. Specifically, improvements of up to 0.07 in accuracy and 0.19 in geometric mean are observed across different datasets and classification algorithms. Additionally, extensive simulations on synthetic datasets with varied characteristics are carried out, providing consistent and reliable results that validate the robustness of the proposal. These findings highlight the capability of the developed method to enhance cleavage prediction, which could potentially contribute to understanding viral processes and developing targeted therapeutic strategies.

Vol. 21, No. 6, pp. 2167-2178. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10645313
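The abstract reports gains in both accuracy and geometric mean. For readers unfamiliar with the second metric, the sketch below assumes the usual binary definition, the square root of sensitivity times specificity. The abstract does not spell out the definition, and the function names are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Toy imbalanced example: sensitivity 3/4, specificity 1/2.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
acc = accuracy_score(y_true, y_pred)
gmean = geometric_mean_score(y_true, y_pred)
```

Unlike accuracy, the geometric mean drops to zero when either class is entirely misclassified, which is why it is often reported alongside accuracy on imbalanced data.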

Citations: 0

A Method for Inferring Polymers Based on Linear Regression and Integer Programming

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-08-22 DOI: 10.1109/TCBB.2024.3447780

Ryota Ido;Shengjuan Cao;Jianshen Zhu;Naveed Ahmed Azam;Kazuya Haraguchi;Liang Zhao;Hiroshi Nagamochi;Tatsuya Akutsu

Citations: 0

KGRACDA: A Model Based on Knowledge Graph from Recursion and Attention Aggregation for CircRNA-Disease Association Prediction

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-08-21 DOI: 10.1109/TCBB.2024.3447110

Ying Wang;Maoyuan Ma;Yanxin Xie;Qinke Peng;Hongqiang Lyu;Hequan Sun;Laiyi Fu

Abstract: CircRNA is closely related to human disease, so it is important to predict circRNA-disease associations (CDAs). However, traditional biological detection methods are difficult and have low accuracy, and computational methods represented by deep learning ignore the ability of the model to explicitly extract local depth information of the CDA. We propose a model based on knowledge graph from recursion and attention aggregation for circRNA-disease association prediction (KGRACDA). This model combines explicit structural features and implicit embedding information of graphs, optimizing graph embedding vectors. First, we build large-scale, multi-source heterogeneous datasets and construct a knowledge graph of multiple RNAs and diseases. After that, we use a recursive method to build multi-hop subgraphs and optimize the graph attention mechanism with a gating mechanism, mining local depth information. At the same time, the model uses a multi-head attention mechanism to balance global and local depth features of graphs and generate CDA prediction scores. KGRACDA surpasses other methods by capturing local and global depth information related to CDA. We update an interactive web platform, HNRBase v2.0, which visualizes circRNA data and allows users to download data and predict CDA using the model.

Vol. 21, No. 6, pp. 2133-2144. Journal Article.

Citations: 0

Parallel Convolutional Contrastive Learning Method for Enzyme Function Prediction

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-08-21 DOI: 10.1109/TCBB.2024.3447037

Xindi Yu;Shusen Zhou;Mujun Zang;Qingjun Wang;Chanjuan Liu;Tong Liu

Abstract: The function labeling of enzymes has a wide range of application value in the medical field, industrial biology and other fields. Scientists define enzyme categories by enzyme commission (EC) numbers. At present, although there are some tools for enzyme function prediction, their effects have not reached the application level. To improve the precision of enzyme function prediction, we propose a parallel convolutional contrastive learning (PCCL) method to predict enzyme functions. First, we use the advanced protein language model ESM-2 to preprocess the protein sequences. Second, PCCL combines convolutional neural networks (CNNs) and contrastive learning to improve the prediction precision of multifunctional enzymes. Contrastive learning can make the model better deal with the problem of class imbalance. Finally, the deep learning framework is mainly composed of three parallel CNNs for fully extracting sample features. We compare PCCL with state-of-the-art enzyme function prediction methods based on three evaluation metrics. The performance of our model improves on both test sets. Especially on the smaller test set, PCCL improves the AUC by 2.57%.

Vol. 21, No. 6, pp. 2604-2609. Journal Article.

Citations: 0