Briefings in Bioinformatics: latest articles

GPSD: a hybrid learning framework for the prediction of phosphatase-specific dephosphorylation sites.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae694
Cheng Han, Shanshan Fu, Miaomiao Chen, Yujie Gou, Dan Liu, Chi Zhang, Xinhe Huang, Leming Xiao, Miaoying Zhao, Jiayi Zhang, Qiang Xiao, Di Peng, Yu Xue
Abstract: Protein phosphorylation is dynamically and reversibly regulated by protein kinases and protein phosphatases, and plays an essential role in orchestrating a wide range of biological processes. Although a number of tools have been developed for predicting kinase-specific phosphorylation sites (p-sites), computational prediction of phosphatase-specific dephosphorylation sites remains to be a great challenge. In this study, we manually curated 4393 experimentally identified site-specific phosphatase-substrate relationships for 3463 dephosphorylation sites occurring on phosphoserine, phosphothreonine, and/or phosphotyrosine residues, from the literature and public databases. Then, we developed a hybrid learning framework, the group-based prediction system for the prediction of phosphatase-specific dephosphorylation sites (GPSD). For model training, we integrated 10 types of sequence features and utilized three types of machine learning methods, including penalized logistic regression, deep neural networks, and transformer neural networks. First, a pretrained model was constructed using 561 416 nonredundant p-sites and then fine-tuned to generate computational models for predicting general dephosphorylation sites. In addition, 103 individual phosphatase-specific predictors were constructed via transfer learning and meta-learning. For site prediction, one or multiple protein sequences in FASTA format could be inputted, and the prediction results will be shown together with additional annotations, such as protein-protein interactions, structural information, and disorder propensity. The online service of GPSD is freely available at https://gpsd.biocuckoo.cn/. We believe that GPSD can serve as a valuable tool for further analysis of dephosphorylation.
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11695897/pdf/
Citations: 0
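The GPSD abstract above mentions FASTA input and sites on serine, threonine, and tyrosine residues. As a rough illustration only, the sketch below parses FASTA text and lists every S/T/Y residue with a flanking sequence window, which is a common first step before scoring sites. The function names, the window size (`flank=7`), and the `*` padding character are assumptions made for this example; the abstract does not describe how GPSD itself encodes its input.

```python
from typing import Dict, Iterator, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {first word of header: sequence}."""
    parts: Dict[str, List[str]] = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record name.
            header = (line[1:].split() or ["unnamed"])[0]
            parts[header] = []
        elif header is not None:
            parts[header].append(line.upper())
    return {name: "".join(chunks) for name, chunks in parts.items()}


def candidate_sites(
    seq: str, flank: int = 7, pad: str = "*"
) -> Iterator[Tuple[int, str, str]]:
    """Yield (1-based position, residue, window) for each S, T, or Y.

    The window holds `flank` residues on each side; positions past the
    sequence ends are filled with `pad` (a hypothetical convention).
    """
    padded = pad * flank + seq + pad * flank
    for i, residue in enumerate(seq):
        if residue in "STY":
            yield i + 1, residue, padded[i : i + 2 * flank + 1]


if __name__ == "__main__":
    records = parse_fasta(">demo_protein example\nMKSAYTRL\n")
    for name, sequence in records.items():
        for pos, residue, window in candidate_sites(sequence, flank=3):
            print(name, pos, residue, window)
```

Each window could then be passed to whatever scoring model is in use; that step is deliberately left out because it is specific to the tool.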
A novel framework for phage-host prediction via logical probability theory and network sparsification.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae708
Ankang Wei, Huanghan Zhan, Zhen Xiao, Weizhong Zhao, Xingpeng Jiang
Abstract: Bacterial resistance has emerged as one of the greatest threats to human health, and phages have shown tremendous potential in addressing the issue of drug-resistant bacteria by lysing host. The identification of phage-host interactions (PHI) is crucial for addressing bacterial infections. Some existing computational methods for predicting PHI are suboptimal in terms of prediction efficiency due to the limited types of available information. Despite the emergence of some supporting information, the generalizability of models using this information is limited by the small scale of the databases. Additionally, most existing models overlook the sparsity of association data, which severely impacts their predictive performance as well. In this study, we propose a dual-view sparse network model (DSPHI) to predict PHI, which leverages logical probability theory and network sparsification. Specifically, we first constructed similarity networks using the sequences of phages and hosts respectively, and then sparsified these networks, enabling the model to focus more on key information during the learning process, thereby improving prediction efficiency. Next, we utilize logical probability theory to compute high-order logical information between phages (hosts), which is known as mutual information. Subsequently, we connect this information in node form to the sparse phage (host) similarity network, resulting in a phage (host) heterogeneous network that better integrates the two information views, thereby reducing the complexity of model computation and enhancing information aggregation capabilities. The hidden features of phages and hosts are explored through graph learning algorithms. Experimental results demonstrate that mutual information is effective information in predicting PHI, and the sparsification procedure of similarity networks significantly improves the model's predictive performance.
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11711101/pdf/
Citations: 0
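The DSPHI abstract above describes sparsifying phage and host similarity networks before learning. One generic way to sparsify a dense similarity matrix is to keep only each node's k strongest neighbours; the sketch below shows that idea. It is an assumption for illustration, since the abstract does not say which sparsification rule DSPHI uses, and `sparsify_top_k` is a hypothetical name.

```python
from typing import List

Matrix = List[List[float]]


def sparsify_top_k(sim: Matrix, k: int) -> Matrix:
    """Keep, for every node, the edges to its k most similar neighbours.

    An edge survives if either endpoint selects it, so the result stays
    symmetric. Self-similarities on the diagonal are dropped.
    """
    n = len(sim)
    keep = [[False] * n for _ in range(n)]
    for i in range(n):
        # Rank all other nodes by similarity to node i, strongest first.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )
        for j in neighbours[:k]:
            keep[i][j] = True
            keep[j][i] = True
    return [
        [sim[i][j] if keep[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    similarity = [
        [1.0, 0.9, 0.2, 0.1],
        [0.9, 1.0, 0.3, 0.8],
        [0.2, 0.3, 1.0, 0.4],
        [0.1, 0.8, 0.4, 1.0],
    ]
    for row in sparsify_top_k(similarity, k=1):
        print(row)
```

A threshold on the similarity value would be an equally plausible rule; top-k is used here only because it bounds the number of edges per node.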
ProtGraph: a tool for the quick and comprehensive exploration and exploitation of the peptide search space derived from protein sequence databases using graphs.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae671
Dominik Lux, Katrin Marcus-Alic, Martin Eisenacher, Julian Uszkoreit
Abstract: Due to computational resource limitations, in mass spectrometry based proteomics only a limited set of peptide sequences is used for the matching against measured spectra. We present an approach to represent proteins by graphs and allow not only the canonical sequences but also known isoforms and annotated amino acid variations, e.g. originating from genomic mutations, and further common protein sequence features contained in Uniprot KB or other protein databases. Our C++ and Python implementation enables a groundbreaking comprehensive characterization of the peptide search space, encompassing for the first time all available annotations in a protein database (in combination more than $10^{200}$ possibilities). Additionally, it can be used to quickly extract the relevant subset of the search space for peptide to spectrum matching, e.g. filtering by the peptide mass. We demonstrate the advantages and innovative findings of our implementation compared to previous workflows by re-analysing publicly available datasets.
Volume 26, issue 1. Journal Article.
Citations: 0
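The ProtGraph abstract above describes representing a protein as a graph so that annotated variants multiply the number of possible sequences. The toy sketch below uses a layered graph with one layer per position, where each layer lists the residues allowed there. It shows why the number of paths can be counted without enumerating them. The data layout and function names are assumptions for this example and are not taken from the ProtGraph implementation, which also handles isoforms and other features.

```python
from typing import Dict, List

Layers = List[List[str]]


def build_layers(seq: str, variants: Dict[int, List[str]]) -> Layers:
    """Return, per 0-based position, the canonical residue plus variants."""
    return [[aa] + variants.get(i, []) for i, aa in enumerate(seq)]


def count_paths(layers: Layers) -> int:
    """Count start-to-end paths by multiplying the options per layer."""
    total = 1
    for options in layers:
        total *= len(options)
    return total


def enumerate_paths(layers: Layers) -> List[str]:
    """List every sequence explicitly; only feasible for small graphs."""
    paths = [""]
    for options in layers:
        paths = [prefix + aa for prefix in paths for aa in options]
    return paths


if __name__ == "__main__":
    # Hypothetical peptide with one substitution at each of two positions.
    layers = build_layers("PEK", {1: ["D"], 2: ["R"]})
    print(count_paths(layers))
    print(enumerate_paths(layers))
```

Counting stays cheap as variants accumulate, while explicit enumeration grows with the product of the layer sizes, which is the practical reason to query the graph rather than expand it.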
Correction to: BANMF-S: a blockwise accelerated non-negative matrix factorization framework with structural network constraints for single cell imputation.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf034
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735464/pdf/
Citations: 0
Deep learning-based design and experimental validation of a medicine-like human antibody library.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf023
Nandhini Rajagopal, Udit Choudhary, Kenny Tsang, Kyle P Martin, Murat Karadag, Hsin-Ting Chen, Na-Young Kwon, Joseph Mozdzierz, Alexander M Horspool, Li Li, Peter M Tessier, Michael S Marlow, Andrew E Nixon, Sandeep Kumar
Abstract: Antibody generation requires the use of one or more time-consuming methods, namely animal immunization, and in vitro display technologies. However, the recent availability of large amounts of antibody sequence and structural data in the public domain along with the advent of generative deep learning algorithms raises the possibility of computationally generating novel antibody sequences with desirable developability attributes. Here, we describe a deep learning model for computationally generating libraries of highly human antibody variable regions whose intrinsic physicochemical properties resemble those of the variable regions of the marketed antibody-based biotherapeutics (medicine-likeness). We generated 100000 variable region sequences of antigen-agnostic human antibodies belonging to the IGHV3-IGKV1 germline pair using a training dataset of 31416 human antibodies that satisfied our computational developability criteria. The in-silico generated antibodies recapitulate intrinsic sequence, structural, and physicochemical properties of the training antibodies, and compare favorably with the experimentally measured biophysical attributes of 100 variable regions of marketed and clinical stage antibody-based biotherapeutics. A sample of 51 highly diverse in-silico generated antibodies with >90th percentile medicine-likeness and >90% humanness was evaluated by two independent experimental laboratories. Our data show the in-silico generated sequences exhibit high expression, monomer content, and thermal stability along with low hydrophobicity, self-association, and non-specific binding when produced as full-length monoclonal antibodies. The ability to computationally generate developable human antibody libraries is a first step towards enabling in-silico discovery of antibody-based biotherapeutics. These findings are expected to accelerate in-silico discovery of antibody-based biotherapeutics and expand the druggable antigen space to include targets refractory to conventional antibody discovery methods requiring in vitro antigen production.
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11757908/pdf/
Citations: 0
Therapeutic gene target prediction using novel deep hypergraph representation learning.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf019
Kibeom Kim, Juseong Kim, Minwook Kim, Hyewon Lee, Giltae Song
Abstract: Identifying therapeutic genes is crucial for developing treatments targeting genetic causes of diseases, but experimental trials are costly and time-consuming. Although many deep learning approaches aim to identify biomarker genes, predicting therapeutic target genes remains challenging due to the limited number of known targets. To address this, we propose HIT (Hypergraph Interaction Transformer), a deep hypergraph representation learning model that identifies a gene's therapeutic potential, biomarker status, or lack of association with diseases. HIT uses hypergraph structures of genes, ontologies, diseases, and phenotypes, employing attention-based learning to capture complex relationships. Experiments demonstrate HIT's state-of-the-art performance, explainability, and ability to identify novel therapeutic targets.
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11752618/pdf/
Citations: 0
PHPGAT: predicting phage hosts based on multimodal heterogeneous knowledge graph with graph attention network.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf017
Fu Liu, Zhimiao Zhao, Yun Liu
Abstract: Antibiotic resistance poses a significant threat to global health, making the development of alternative strategies to combat bacterial pathogens increasingly urgent. One such promising approach is the strategic use of bacteriophages (or phages) to specifically target and eradicate antibiotic-resistant bacteria. Phages, being among the most prevalent life forms on Earth, play a critical role in maintaining ecological balance by regulating bacterial communities and driving genetic diversity. Accurate prediction of phage hosts is essential for successfully applying phage therapy. However, existing prediction models may not fully encapsulate the complex dynamics of phage-host interactions in diverse microbial environments, indicating a need for improved accuracy through more sophisticated modeling techniques. In response to this challenge, this study introduces a novel phage-host prediction model, PHPGAT, which leverages a multimodal heterogeneous knowledge graph with the advanced GATv2 (Graph Attention Network v2) framework. The model first constructs a multimodal heterogeneous knowledge graph by integrating phage-phage, host-host, and phage-host interactions to capture the intricate connections between biological entities. GATv2 is then employed to extract deep node features and learn dynamic interdependencies, generating context-aware embeddings. Finally, an inner product decoder is designed to compute the likelihood of interaction between a phage and host pair based on the embedding vectors produced by GATv2. Evaluation results using two datasets demonstrate that PHPGAT achieves precise phage host predictions and outperforms other models. PHPGAT is available at https://github.com/ZhaoZMer/PHPGAT.
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11745545/pdf/
Citations: 0
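The PHPGAT abstract above ends with an inner product decoder that turns a phage embedding and a host embedding into an interaction likelihood. A minimal sketch of such a decoder follows. Passing the dot product through a sigmoid is an assumption made here, as are the function names and the toy embedding values; the abstract only states that an inner product is used, and the real embeddings would come from the trained GATv2 encoder.

```python
import math
from typing import Dict, List, Sequence, Tuple


def inner_product_score(z_phage: Sequence[float], z_host: Sequence[float]) -> float:
    """Map the dot product of two embeddings to a value in (0, 1)."""
    if len(z_phage) != len(z_host):
        raise ValueError("embeddings must have the same dimension")
    logit = sum(p * h for p, h in zip(z_phage, z_host))
    # Numerically stable sigmoid: avoid exp() overflow for large |logit|.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def rank_hosts(
    z_phage: Sequence[float], hosts: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (host name, score) pairs sorted from most to least likely."""
    scored = [(name, inner_product_score(z_phage, z)) for name, z in hosts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up embeddings, used only to exercise the functions.
    phage = [0.5, -1.0, 2.0]
    hosts = {"host_a": [1.0, 0.0, 1.0], "host_b": [-1.0, 1.0, 0.0]}
    for name, score in rank_hosts(phage, hosts):
        print(name, round(score, 3))
```

Because the decoder has no parameters of its own, all of the predictive capacity sits in the embeddings, which is a common design choice for link prediction on graphs.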
AntiBinder: utilizing bidirectional attention and hybrid encoding for precise antibody-antigen interaction prediction.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf008
Kaiwen Zhang, Yuhao Tao, Fei Wang
Abstract: Antibodies play a key role in medical diagnostics and therapeutics. Accurately predicting antibody-antigen binding is essential for developing effective treatments. Traditional protein-protein interaction prediction methods often fall short because they do not account for the unique structural and dynamic properties of antibodies and antigens. In this study, we present AntiBinder, a novel predictive model specifically designed to address these challenges. AntiBinder integrates the unique structural and sequence characteristics of antibodies and antigens into its framework and employs a bidirectional cross-attention mechanism to automatically learn the intrinsic mechanisms of antigen-antibody binding, eliminating the need for manual feature engineering. Our comprehensive experiments, which include predicting interactions between known antigens and new antibodies, predicting the binding of previously unseen antigens, and predicting cross-species antigen-antibody interactions, demonstrate that AntiBinder outperforms existing state-of-the-art methods. Notably, AntiBinder excels in predicting interactions with unseen antigens and maintains a reasonable level of predictive capability in challenging cross-species prediction tasks. AntiBinder's ability to model complex antigen-antibody interactions highlights its potential applications in biomedical research and therapeutic development, including the design of vaccines and antibody therapies for rapidly emerging infectious diseases.
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11744619/pdf/
Citations: 0
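The AntiBinder abstract above refers to a bidirectional cross-attention mechanism between antibody and antigen representations. The sketch below shows the general idea with plain scaled dot-product attention run in both directions: antibody tokens attend over antigen tokens, and antigen tokens attend over antibody tokens. It omits learned query, key, and value projections and multiple heads, and all names are hypothetical, so it should be read as a simplified illustration rather than the AntiBinder architecture.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Softmax with max-subtraction for numerical stability."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries: Matrix, keys_values: Matrix) -> Matrix:
    """Each query row becomes a weighted average of the key/value rows."""
    dim = len(queries[0])
    if any(len(row) != dim for row in queries + keys_values):
        raise ValueError("all token vectors must share one dimension")
    output = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys_values
        ]
        weights = softmax(scores)
        output.append(
            [sum(w * kv[c] for w, kv in zip(weights, keys_values)) for c in range(dim)]
        )
    return output


def bidirectional_cross_attention(
    antibody: Matrix, antigen: Matrix
) -> Tuple[Matrix, Matrix]:
    """Return (antibody tokens informed by antigen, antigen tokens informed by antibody)."""
    return cross_attend(antibody, antigen), cross_attend(antigen, antibody)


if __name__ == "__main__":
    antibody_tokens = [[1.0, 0.0], [0.0, 1.0]]
    antigen_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    ab_ctx, ag_ctx = bidirectional_cross_attention(antibody_tokens, antigen_tokens)
    print(ab_ctx)
    print(ag_ctx)
```

Running attention in both directions gives each sequence a view of the other, and the two outputs keep the token counts of their respective query sequences.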
Graph contrastive learning of subcellular-resolution spatial transcriptomics improves cell type annotation and reveals critical molecular pathways.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbaf020
Qiaolin Lu, Jiayuan Ding, Lingxiao Li, Yi Chang
Abstract: Imaging-based spatial transcriptomics (iST), such as MERFISH, CosMx SMI, and Xenium, quantify gene expression level across cells in space, but more importantly, they directly reveal the subcellular distribution of RNA transcripts at the single-molecule resolution. The subcellular localization of RNA molecules plays a crucial role in the compartmentalization-dependent regulation of genes within individual cells. Understanding the intracellular spatial distribution of RNA for a particular cell type thus not only improves the characterization of cell identity but also is of paramount importance in elucidating unique subcellular regulatory mechanisms specific to the cell type. However, current cell type annotation approaches of iST primarily utilize gene expression information while neglecting the spatial distribution of RNAs within cells. In this work, we introduce a semi-supervised graph contrastive learning method called Focus, the first method, to the best of our knowledge, that explicitly models RNA's subcellular distribution and community to improve cell type annotation. Focus demonstrates significant improvements over state-of-the-art algorithms across a range of spatial transcriptomics platforms, achieving improvements up to 27.8% in terms of accuracy and 51.9% in terms of F1-score for cell type annotation. Furthermore, Focus enjoys the advantages of intricate cell type-specific subcellular spatial gene patterns and providing interpretable subcellular gene analysis, such as defining the gene importance score. Importantly, with the importance score, Focus identifies genes harboring strong relevance to cell type-specific pathways, indicating its potential in uncovering novel regulatory programs across numerous biological systems.
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11781232/pdf/
Citations: 0
Towards molecular structure discovery from cryo-ET density volumes via modelling auxiliary semantic prototypes.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics Pub Date : 2024-11-22 DOI: 10.1093/bib/bbae570
Ashwin Nair, Xingjian Li, Bhupendra Solanki, Souradeep Mukhopadhyay, Ankit Jha, Mostofa Rafid Uddin, Mainak Singha, Biplab Banerjee, Min Xu
Abstract: Cryo-electron tomography (cryo-ET) is confronted with the intricate task of unveiling novel structures. General class discovery (GCD) seeks to identify new classes by learning a model that can pseudo-label unannotated (novel) instances solely using supervision from labeled (base) classes. While 2D GCD for image data has made strides, its 3D counterpart remains unexplored. Traditional methods encounter challenges due to model bias and limited feature transferability when clustering unlabeled 2D images into known and potentially novel categories based on labeled data. To address this limitation and extend GCD to 3D structures, we propose an innovative approach that harnesses a pretrained 2D transformer, enriched by an effective weight inflation strategy tailored for 3D adaptation, followed by a decoupled prototypical network. Incorporating the power of pretrained weight-inflated Transformers, we further integrate CLIP, a vision-language model to incorporate textual information. Our method synergizes a graph convolutional network with CLIP's frozen text encoder, preserving class neighborhood structure. In order to effectively represent unlabeled samples, we devise semantic distance distributions, by formulating a bipartite matching problem for category prototypes using a decoupled prototypical network. Empirical results unequivocally highlight our method's potential in unveiling hitherto unknown structures in cryo-ET. By bridging the gap between 2D GCD and the distinctive challenges of 3D cryo-ET data, our approach paves novel avenues for exploration and discovery in this domain.
Volume 26, issue 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11790060/pdf/
Citations: 0
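The cryo-ET abstract above mentions a weight inflation strategy for adapting a pretrained 2D transformer to 3D volumes. The abstract does not specify the strategy, so the sketch below shows one widely used form as an assumption: copy a 2D kernel along a new depth axis and divide by the depth, so that a volume whose slices are all identical produces the same response as the original 2D kernel on one slice. Names and data layout are hypothetical.

```python
from typing import List

Kernel2D = List[List[float]]
Kernel3D = List[List[List[float]]]


def inflate_kernel(kernel_2d: Kernel2D, depth: int) -> Kernel3D:
    """Repeat a 2D kernel `depth` times along a new axis, scaled by 1/depth."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return [[[w / depth for w in row] for row in kernel_2d] for _ in range(depth)]


def response_2d(kernel: Kernel2D, patch: Kernel2D) -> float:
    """Sum of elementwise products between a kernel and a same-sized patch."""
    return sum(k * p for k_row, p_row in zip(kernel, patch) for k, p in zip(k_row, p_row))


def response_3d(kernel: Kernel3D, volume: Kernel3D) -> float:
    """Sum of elementwise products between a 3D kernel and a same-sized volume."""
    return sum(response_2d(k_slice, v_slice) for k_slice, v_slice in zip(kernel, volume))


if __name__ == "__main__":
    kernel = [[1.0, 2.0], [3.0, 4.0]]
    patch = [[0.5, 1.0], [1.5, 2.0]]
    inflated = inflate_kernel(kernel, depth=4)
    volume = [patch for _ in range(4)]  # the same slice repeated along depth
    print(response_2d(kernel, patch), response_3d(inflated, volume))
```

The 1/depth scaling is what lets the inflated model start from activations close to those of the pretrained 2D model before any 3D fine-tuning.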