Towards molecular structure discovery from cryo-ET density volumes via modelling auxiliary semantic prototypes
Authors: Ashwin Nair, Xingjian Li, Bhupendra Solanki, Souradeep Mukhopadhyay, Ankit Jha, Mostofa Rafid Uddin, Mainak Singha, Biplab Banerjee, Min Xu
Journal: Briefings in Bioinformatics, 26(1) (impact factor 6.8; JCR Q1, Biochemical Research Methods)
Published: 2024-11-22
Publication type: Journal Article
DOI: 10.1093/bib/bbae570 (https://doi.org/10.1093/bib/bbae570)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11790060/pdf/
Cryo-electron tomography (cryo-ET) faces the intricate task of revealing novel structures. General class discovery (GCD) seeks to identify new classes by learning a model that can pseudo-label unannotated (novel) instances using supervision only from labeled (base) classes. While 2D GCD for image data has made strides, its 3D counterpart remains unexplored. Traditional methods struggle with model bias and limited feature transferability when clustering unlabeled 2D images into known and potentially novel categories based on labeled data. To address this limitation and extend GCD to 3D structures, we propose an approach that uses a pretrained 2D transformer, adapted to 3D through a weight inflation strategy, followed by a decoupled prototypical network. Building on the pretrained weight-inflated transformer, we further integrate CLIP, a vision-language model, to incorporate textual information. Our method combines a graph convolutional network with CLIP's frozen text encoder, preserving class neighborhood structure. To represent unlabeled samples effectively, we devise semantic distance distributions by formulating a bipartite matching problem for category prototypes using a decoupled prototypical network. Empirical results highlight our method's potential for revealing previously unknown structures in cryo-ET. By bridging the gap between 2D GCD and the distinctive challenges of 3D cryo-ET data, our approach opens new avenues for exploration and discovery in this domain.
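The bipartite matching of category prototypes mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the cost function or the solver, so everything here is an assumption made for illustration: the function names (`squared_distance`, `match_prototypes`), the squared Euclidean cost, the toy 2D prototype vectors, and the brute-force search over permutations. It is not the authors' implementation.

```python
from itertools import permutations


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed cost)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_prototypes(visual, semantic):
    """One-to-one matching of prototypes that minimises the total pairwise cost.

    Returns (assignment, cost), where assignment[i] is the index of the
    semantic prototype matched to visual prototype i.
    Brute force over all permutations: only practical for a handful of classes.
    """
    if len(visual) != len(semantic):
        raise ValueError("expected the same number of prototypes on both sides")
    # cost[i][j] = cost of pairing visual prototype i with semantic prototype j
    cost = [[squared_distance(v, s) for s in semantic] for v in visual]
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(semantic))):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Hypothetical toy prototypes, chosen only to make the matching easy to follow.
visual_prototypes = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
semantic_prototypes = [[0.9, 0.1], [1.1, 0.9], [0.1, 1.0]]

assignment, total_cost = match_prototypes(visual_prototypes, semantic_prototypes)
print(assignment, round(total_cost, 2))  # prints [2, 0, 1] 0.05
```

For more than a few classes, a polynomial-time solver such as the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`) would replace the permutation search; the brute-force version is used here only to keep the sketch dependency-free.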
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.