Frontiers in bioinformatics: latest articles

A breast cancer-specific combinational QSAR model development using machine learning and deep learning approaches.
IF 2.8
Frontiers in bioinformatics Pub Date : 2024-01-15 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1328262
Anush Karampuri, Shyam Perugu
Abstract: Breast cancer is the most prevalent and heterogeneous form of cancer affecting women worldwide. Depending on the extent of disease spread, therapeutic strategies in practice include surgery, chemotherapy, radiotherapy, and immunotherapy. Combinational therapy is another strategy that has proven effective in controlling cancer progression: an anchor drug, a well-established primary therapeutic agent with known efficacy for specific targets, is administered with a library drug, a supplementary drug that enhances the efficacy of the anchor drug and broadens the therapeutic approach. Our work focused on harnessing regression-based machine learning (ML) and deep learning (DL) algorithms to develop a structure-activity relationship between the molecular descriptors of drug pairs and their combined biological activity through a quantitative structure-activity relationship (QSAR) model. Eleven widely used ML and DL algorithms were used to develop QSAR models. A total of 52 breast cancer cell lines, 25 anchor drugs, and 51 library drugs were considered in developing the QSAR model. Deep neural networks (DNNs) achieved an R² (coefficient of determination) of 0.94 with a root mean square error (RMSE) of 0.255, making them the most effective algorithm for developing a structure-activity relationship with strong generalization capabilities. In conclusion, applying combinational therapy alongside ML and DL techniques represents a promising approach to combating breast cancer.
Volume 3, pages 1328262. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10822965/pdf/
Citations: 0
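The two metrics quoted in this abstract, R² and RMSE, can be computed as in the minimal sketch below. The function names and the activity values are made up for illustration; none of this is the authors' code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted activities (not from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print("RMSE:", rmse(observed, predicted))
print("R^2:", r_squared(observed, predicted))
```

RMSE is expressed in the units of the predicted activity, while R² is unitless, so the two are usually reported together as in the abstract.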
BPAGS: a web application for bacteriocin prediction via feature evaluation using alternating decision tree, genetic algorithm, and linear support vector classifier
Frontiers in bioinformatics Pub Date : 2024-01-10 DOI: 10.3389/fbinf.2023.1284705
Suraiya Akhter, John H. Miller
Abstract: The use of bacteriocins has emerged as a promising strategy in the development of new drugs to combat antibiotic resistance, given their ability to kill bacteria with both broad and narrow natural spectra. Hence, there is a compelling need for a precise and efficient computational model that can accurately predict novel bacteriocins. Machine learning can learn patterns and features from bacteriocin sequences that are difficult to capture using sequence matching-based methods, which makes it a potentially superior choice for accurate prediction. In this study, a web application for predicting bacteriocins was created using a machine learning approach. The feature sets employed in the application were chosen using alternating decision tree (ADTree), genetic algorithm (GA), and linear support vector classifier (linear SVC)-based feature evaluation methods. Initially, potential features were extracted from the physicochemical, structural, and sequence-profile attributes of both bacteriocin and non-bacteriocin protein sequences. We assessed the candidate features first using the Pearson correlation coefficient, followed by separate evaluations with ADTree, GA, and linear SVC to eliminate unnecessary features. Finally, we constructed random forest (RF), support vector machine (SVM), decision tree (DT), logistic regression (LR), k-nearest neighbors (KNN), and Gaussian naïve Bayes (GNB) models using the reduced feature sets. The overall top-performing model was an SVM with ADTree-reduced features, which achieved an accuracy of 99.11% and an AUC value of 0.9984 on the testing dataset. We also assessed the predictive capabilities of our best-performing models for each reduced feature set relative to our previously developed software solution, a sequence alignment-based tool, and a deep-learning approach. A web application, titled BPAGS (Bacteriocin Prediction based on ADTree, GA, and linear SVC), was developed to incorporate the predictive models built using ADTree, GA, and linear SVC-based feature sets. Currently, the web-based tool provides classification results with associated probability values and has options to add new samples to the training data to improve the predictive efficacy. BPAGS is freely accessible at https://shiny.tricities.wsu.edu/bacteriocin-prediction/.
Citations: 0
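The first screening step mentioned in this abstract, a Pearson correlation check on candidate features, might look roughly like the sketch below. It assumes the goal is to drop one feature from each highly correlated pair; the 0.9 threshold, the feature names, and the values are all hypothetical, and the abstract does not say how the authors applied the coefficient.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns (one value per protein sequence).
features = {
    "length": [10, 20, 30, 40, 50],
    "mass": [11, 21, 29, 41, 52],   # nearly proportional to length
    "charge": [1, -1, 2, -2, 0],
}

print(drop_correlated(features))
```

Because the filter is greedy, the result depends on the order in which features are visited; a later ADTree, GA, or linear SVC evaluation would then run on whatever survives.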
No-boundary thinking for artificial intelligence in bioinformatics and education
Frontiers in bioinformatics Pub Date : 2024-01-08 DOI: 10.3389/fbinf.2023.1332902
Prajay Patel, Nisha Pillai, Inimary T. Toby
Abstract: No-boundary thinking enables the scientific community to reflect in a thoughtful manner and discover new opportunities, create innovative solutions, and break through barriers that might otherwise have constrained their progress. This concept encourages thinking without being confined by traditional rules, limitations, or established norms, and a mindset that is not limited by previous work, leading to fresh perspectives and innovative outcomes. So, where do we see the field of artificial intelligence (AI) in bioinformatics going in the next 30 years? That was the theme of a “No-Boundary Thinking” session held as part of the Mid-South Computational Bioinformatics Society’s (MCBIOS) 19th annual meeting in Irving, Texas. This session addressed various areas of AI in an open discussion and raised perspectives on how popular tools like ChatGPT can be integrated into bioinformatics, how to communicate with scientists in different fields to properly utilize the potential of these algorithms, and how to continue educational outreach to foster interest in data science and informatics among the next generation of scientists.
Citations: 0
Protein-lipid interactions and protein anchoring modulate the modes of association of the globular domain of the Prion protein and Doppel protein to model membrane patches
Frontiers in bioinformatics Pub Date : 2024-01-05 DOI: 10.3389/fbinf.2023.1321287
Patricia Soto, Davis T. Thalhuber, Frank Luceri, Jamie Janos, Mason R. Borgman, Noah M. Greenwood, Sofia Acosta, Hunter Stoffel
Abstract: The Prion protein is the molecular hallmark of the incurable prion diseases affecting mammals, including humans. The protein-only hypothesis states that the misfolding, accumulation, and deposition of the Prion protein play a critical role in toxicity. The cellular Prion protein (PrPC) anchors to the extracellular leaflet of the plasma membrane and prefers cholesterol- and sphingomyelin-rich membrane domains. Conformational conversion of the Prion protein into the pathological isoform happens on the cell surface. In vitro and in vivo experiments indicate that Prion protein misfolding, aggregation, and toxicity are sensitive to the lipid composition of plasma membranes and vesicles. A picture of the underlying biophysical driving forces that explain the effect of Prion protein-lipid interactions in physiological conditions is needed to develop a structural model of Prion protein conformational conversion. To this end, we use molecular dynamics simulations that mimic the interactions between the globular domain of PrPC and the model membrane patches to which it is anchored. In addition, we simulate the Doppel protein anchored to such membrane patches. The Doppel protein is the closest to PrPC in the phylogenetic tree, localizes in an extracellular milieu similar to that of PrPC, and exhibits a topology similar to that of PrPC even though the amino acid sequence is only 25% identical. Our simulations show that specific protein-lipid interactions and conformational constraints imposed by GPI anchoring together favor specific binding sites in globular PrPC but not in Doppel. Interestingly, the binding sites we found in PrPC correspond to prion protein loops that are critical in aggregation and the prion disease transmission barrier (β2-α2 loop) and in initial spontaneous misfolding (α2-α3 loop). We also found that the membrane rearranges locally to accommodate protein residues inserted in the membrane surface as a response to protein binding.
Citations: 0
RxNorm for drug name normalization: a case study of prescription opioids in the FDA adverse events reporting system
Frontiers in bioinformatics Pub Date : 2024-01-05 DOI: 10.3389/fbinf.2023.1328613
Huyen Le, Ru Chen, Stephen Harris, Hong Fang, Beverly Lyn-Cook, H. Hong, W. Ge, Paul Rogers, Weida Tong, Wen Zou
Abstract: Numerous studies have been conducted on the US Food and Drug Administration (FDA) Adverse Events Reporting System (FAERS) database to assess post-marketing reporting rates for drug safety review and risk assessment. However, the drug names in the adverse event (AE) reports from FAERS are heterogeneous because the information submitted mandatorily by pharmaceutical companies and voluntarily by patients, healthcare professionals, and the public lacks uniformity. Studies that use FAERS and other spontaneous AE reporting databases without drug name normalization may collect AE reports incompletely because of non-standard drug names, and the accuracy of their results might be affected. In this study, we demonstrated the applicability of RxNorm, developed by the National Library of Medicine, for drug name normalization in FAERS. Using prescription opioids as a case study, we used the RxNorm application programming interface (API) to map all FDA-approved prescription opioids described in FAERS AE reports to their equivalent RxNorm Concept Unique Identifiers (RxCUIs) and RxNorm names. The different names of the opioids were then extracted, and their usage frequencies were calculated in a collection of more than 14.9 million AE reports for 13 FDA-approved prescription opioid classes, reported over 17 years. The results showed that a significant number of different names were consistently used for opioids in FAERS reports, with 2,086 different names (out of 7,892) used at least three times and 842 different names used at least ten times for each of the 92 RxNorm names of FDA-approved opioids. Our method of using RxNorm API mapping was confirmed to be efficient and accurate and capable of significantly reducing the heterogeneity of prescription opioid names in FAERS AE reports. It is expected to apply broadly to sets of drug names from any database where drug names are diverse and unnormalized, and to automatically standardize and link different representations of the same drugs to build an intact, high-quality database for diverse research, particularly post-marketing data analysis in pharmacovigilance initiatives.
Citations: 0
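The counting step this abstract describes, tallying how often each reported name variant is used for a given normalized name, can be sketched as follows. The study obtained its mapping from the RxNorm API; here a small hard-coded dictionary stands in for that lookup, and every name, count, and helper function is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for an RxNorm lookup: raw reported name -> normalized name.
RAW_TO_NORMALIZED = {
    "oxycodone hcl": "oxycodone",
    "oxycontin": "oxycodone",
    "morphine sulfate": "morphine",
    "ms contin": "morphine",
}


def variant_counts(reported_names, mapping):
    """Count, per normalized name, how often each raw variant occurs."""
    counts = defaultdict(Counter)
    for raw in reported_names:
        normalized = mapping.get(raw.strip().lower())
        if normalized is not None:  # names the mapping does not cover are skipped
            counts[normalized][raw.strip().lower()] += 1
    return counts


def variants_used_at_least(counts, k):
    """For each normalized name, list the raw variants seen at least k times."""
    return {
        normalized: sorted(raw for raw, n in counter.items() if n >= k)
        for normalized, counter in counts.items()
    }


# Hypothetical drug-name fields from adverse event reports.
reports = (
    ["OxyContin"] * 3 + ["oxycodone hcl"] * 2 + ["MS Contin"] * 3 + ["unknown drug"]
)

counts = variant_counts(reports, RAW_TO_NORMALIZED)
print(variants_used_at_least(counts, 3))
```

Lower-casing and trimming before the lookup is a design choice of this sketch; a real pipeline would rely on the normalization service itself to reconcile spelling and case variants.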
Editorial: Expert opinions in protein bioinformatics: 2022
Frontiers in bioinformatics Pub Date : 2024-01-05 DOI: 10.3389/fbinf.2023.1338560
Daisuke Kihara
Citations: 0
Genomic risk prediction of cardiovascular diseases among type 2 diabetes patients in the UK Biobank
Frontiers in bioinformatics Pub Date : 2024-01-04 DOI: 10.3389/fbinf.2023.1320748
Yixuan Ye, Jiaqi Hu, Fuyuan Pang, Can Cui, Hongyu Zhao
Abstract: Background: Polygenic risk score (PRS) has proved useful in predicting the risk of cardiovascular diseases (CVD) based on the genotypes of an individual, but most analyses have focused on disease onset in the general population. The usefulness of PRS to predict CVD risk among type 2 diabetes (T2D) patients remains unclear. Methods: We built a meta-PRSCVD upon the candidate PRSs developed from state-of-the-art PRS methods for three CVD subtypes of significant importance: coronary artery disease (CAD), ischemic stroke (IS), and heart failure (HF). To evaluate the prediction performance of the meta-PRSCVD, we restricted our analysis to 21,092 white British T2D patients in the UK Biobank, among whom 4,015 had CVD events. Results: The meta-PRSCVD was significantly associated with CVD risk, with a hazard ratio per standard deviation increase of 1.28 (95% CI: 1.23–1.33). The meta-PRSCVD alone predicted CVD incidence with an area under the receiver operating characteristic curve (AUC) of 0.57 (95% CI: 0.54–0.59). When the analysis was restricted to early-onset patients (onset age ≤ 55), the AUC increased further to 0.61 (95% CI: 0.56–0.67). Conclusion: Our results highlight the potential role of genomic screening for secondary prevention of CVD among T2D patients, especially among early-onset patients.
Citations: 0
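The AUC values reported in this abstract summarize how well a score ranks patients with events above patients without. A minimal sketch of that definition is given below: the AUC is the fraction of case-control pairs in which the case has the higher score, with ties counted as one half. The scores and labels are invented for illustration and the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    labels: 1 for patients with an event (cases), 0 for those without (controls).
    Tied scores contribute 0.5 to the count.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical risk scores and event labels (not UK Biobank data).
scores = [0.2, 0.9, 0.4, 0.7, 0.1, 0.6]
labels = [0, 1, 1, 0, 0, 1]

print(roc_auc(scores, labels))
```

An AUC of 0.5 corresponds to random ranking, which gives some context for values such as 0.57 and 0.61. The pairwise loop is quadratic in the number of patients, so a rank-based implementation would be preferable at biobank scale.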
Identifying epigenetic aging moderators using the epigenetic pacemaker
Frontiers in bioinformatics Pub Date : 2024-01-03 DOI: 10.3389/fbinf.2023.1308680
Colin Farrell, Chanyue Hu, Kalsuda Lapborisuth, Kyle Pu, S. Snir, Matteo Pellegrini
Abstract: Epigenetic clocks are DNA methylation-based chronological age prediction models that are commonly employed to study age-related biology. The difference between the predicted and observed age is often interpreted as a form of biological age acceleration, and many studies have measured the impact of environmental and disease-associated factors on epigenetic age. Most epigenetic clocks are fit using approaches that minimize the error between the predicted and observed chronological age, and as a result, they may not accurately model the impact of factors that moderate the relationship between the actual and epigenetic age. Here, we compare epigenetic clocks that are constructed using penalized regression methods to an evolutionary framework of epigenetic aging with the epigenetic pacemaker (EPM), which directly models DNA methylation as a function of a time-dependent epigenetic state. In simulations, we show that the value of the epigenetic state is impacted by factors such as age, sex, and cell-type composition. Next, in a dataset aggregated from previous studies, we show that the epigenetic state is also moderated by sex and cell type. Finally, we demonstrate that the epigenetic state is also moderated by toxins in a study on polybrominated biphenyl exposure. Thus, we find that the pacemaker provides a robust framework for the study of factors that impact epigenetic age acceleration and that the effect of these factors may be obscured in traditional clocks based on linear regression models.
Citations: 0
Improving fluorescence lifetime imaging microscopy phasor accuracy using convolutional neural networks
Frontiers in bioinformatics Pub Date : 2023-12-22 DOI: 10.3389/fbinf.2023.1335413
Varun Mannam, Jacob P. Brandt, Cody J. Smith, Xiaotong Yuan, S. Howard
Abstract: Introduction: Although a powerful biological imaging technique, fluorescence lifetime imaging microscopy (FLIM) faces challenges such as a slow acquisition rate, a low signal-to-noise ratio (SNR), and high cost and complexity. To address the fundamental problem of low SNR in FLIM images, we demonstrate how to use pre-trained convolutional neural networks (CNNs) to reduce noise in FLIM measurements. Methods: Our approach uses pre-learned models that have previously been validated on large datasets whose distributions differ from those of the training datasets (for example, in sample structures, noise distributions, and microscopy modalities in fluorescence microscopy), which eliminates the need to train a neural network from scratch or to acquire a large training dataset to denoise FLIM data. In addition, we use the pre-trained networks in the inference stage, where the computation time is in milliseconds and the accuracy is better than that of traditional denoising methods. To separate different fluorophores in lifetime images, the denoised images are then run through an unsupervised machine learning technique, K-means clustering. Results and Discussion: The results of experiments carried out on in vivo mouse kidney tissue, fluorescently labeled bovine pulmonary artery endothelial (BPAE) fixed cells, and fluorescently labeled mouse kidney fixed samples show that our method can effectively remove noise from FLIM images and improve segmentation accuracy. Additionally, the performance of our method on out-of-distribution, highly scattering in vivo plant samples shows that it can also improve SNR in challenging imaging conditions. Our proposed method provides a fast and accurate way to segment fluorescence lifetime images captured using any FLIM system. It is especially effective for separating fluorophores in noisy FLIM images, which are common in in vivo imaging where averaging is not applicable. Our approach significantly improves the identification of vital biologically relevant structures in biomedical imaging applications.
Citations: 0
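The clustering step named in this abstract can be illustrated on phasor coordinates with a small, self-contained K-means sketch. It assumes ideal mono-exponential decays, for which the phasor coordinates are g = 1/(1 + (ωτ)²) and s = ωτ/(1 + (ωτ)²); the 80 MHz modulation frequency, the lifetimes, and the caller-supplied initial centroids are all hypothetical, and this is not the authors' pipeline.

```python
import math


def phasor(tau_ns, freq_mhz):
    """Phasor coordinates (g, s) of an ideal mono-exponential decay."""
    omega_tau = 2 * math.pi * freq_mhz * 1e6 * tau_ns * 1e-9
    denom = 1 + omega_tau ** 2
    return (1 / denom, omega_tau / denom)


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm on 2-D points with caller-supplied initial centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                + (p[1] - centroids[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                updated.append(old)  # an empty cluster keeps its centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical pixel lifetimes (ns) from two fluorophores, at 80 MHz.
lifetimes_ns = [0.9, 1.0, 1.1, 3.8, 4.0, 4.2]
points = [phasor(tau, 80.0) for tau in lifetimes_ns]

labels, centroids = kmeans(points, centroids=[points[0], points[-1]])
print(labels)
```

Passing the initial centroids explicitly keeps the sketch deterministic; library implementations typically choose them with a randomized seeding scheme instead.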
Attention network for predicting T-cell receptor-peptide binding can associate attention with interpretable protein structural properties.
Frontiers in bioinformatics Pub Date : 2023-12-18 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1274599
Kyohei Koyama, Kosuke Hashimoto, Chioko Nagao, Kenji Mizuguchi
Abstract: Understanding how a T-cell receptor (TCR) recognizes its specific ligand peptide is crucial for gaining insight into biological functions and disease mechanisms. Despite its importance, experimentally determining TCR-peptide-major histocompatibility complex (TCR-pMHC) interactions is expensive and time-consuming. To address this challenge, computational methods have been proposed, but they are typically evaluated by internal retrospective validation only, and few researchers have incorporated and tested an attention layer from language models into structural information. Therefore, in this study, we developed a machine learning model based on a modified version of Transformer, a source-target attention neural network, to predict the TCR-pMHC interaction solely from the amino acid sequences of the TCR complementarity-determining region (CDR) 3 and the peptide. This model achieved competitive performance on a benchmark dataset of the TCR-pMHC interaction, as well as on a truly new external dataset. Additionally, by analyzing the results of binding predictions, we associated the neural network weights with protein structural properties. By classifying the residues into large- and small-attention groups, we identified statistically significant properties associated with the largely attended residues, such as hydrogen bonds within CDR3. The dataset that we created and the ability of our model to provide an interpretable prediction of TCR-peptide binding should increase our knowledge about molecular recognition and pave the way for designing new therapeutics.
Volume 3, pages 1274599. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10759225/pdf/
Citations: 0
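Source-target attention, which this abstract names as the basis of the model, lets positions of one sequence attend over positions of another. The sketch below shows generic scaled dot-product attention with queries taken from one sequence and keys and values from the other; it is not the authors' modified Transformer, and the tiny two-dimensional embeddings are invented for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax of a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries from one sequence, keys/values from another.

    Returns (outputs, weights); weights[i][j] is how strongly query position i
    attends to key position j, and each row of weights sums to 1.
    """
    d = len(keys[0])
    weights = [
        softmax([sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d) for k in keys])
        for q in queries
    ]
    outputs = [
        [sum(w * v[c] for w, v in zip(row, values)) for c in range(len(values[0]))]
        for row in weights
    ]
    return outputs, weights


# Hypothetical embeddings: 2 query positions (e.g. CDR3 residues),
# 3 key/value positions (e.g. peptide residues).
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[2.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

outputs, weights = cross_attention(queries, keys, values)
for row in weights:
    print([round(w, 3) for w in row])
```

Inspecting rows of `weights` is the kind of analysis that allows residues to be grouped into large- and small-attention sets, which is the interpretability angle the abstract describes.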