arXiv - QuanBio - Biomolecules: Latest Papers

Specific Nucleic Acid Detection Using a Nanoparticle Hybridization Assay
arXiv - QuanBio - Biomolecules Pub Date : 2024-09-06 DOI: arxiv-2409.03983
A. A. Aldakheel, C. B. Raub, H. T. Bui
Abstract: Simple methods to detect biomolecules including specific nucleic acid sequences have received renewed attention since the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) virus pandemic. Notably, biomolecule detection that uses some form of signal amplification will have some form of amplification-related error, which in the polymerase chain reaction involves mispriming and subsequent signal amplification in the no template control, ultimately providing a limit of detection. To demonstrate the feasibility of the detection of a DNA target sequence without molecular or chemical signal amplification that avoids amplification errors, a gold nanoparticle aggregation assay was developed and tested. Two primers bracketing a 94 base pair target sequence from SARS-CoV-2 were conjugated to 10 nm diameter gold nanoparticles by the salt aging method, with conjugation and primer-target hybridization confirmed by agarose gel electrophoresis and nanospectrophotometry. Upon mixing of both conjugated nanoparticles with target, a surface plasmon resonance shift of 6 nm was observed, and lower electrophoretic mobility of a band containing both DNA fluorescence and gold absorption signals. This did not occur in the presence of a control DNA molecule of the same size and composition as the target but with a randomly scrambled base position. Nanoparticle tracking at 30 frames per second using a sensitive darkfield microscope revealed a lower measured diffusion coefficient of scattering objects in the target mixture than in the control mixture or with bare gold nanoparticles.
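To illustrate how a diffusion coefficient can be estimated from particle tracking at 30 frames per second, the sketch below applies the two-dimensional relation MSD = 4·D·Δt at a lag of one frame. The function name, the track format (a list of (x, y) positions in micrometres), and the single-lag estimator are assumptions made for illustration; the abstract does not say how the authors computed their values.

```python
def diffusion_coefficient(track, fps=30.0):
    """Estimate D (um^2/s) from one 2D track.

    Assumes free Brownian motion in 2D, so the mean squared
    displacement over one frame interval dt obeys MSD = 4 * D * dt.
    `track` is a hypothetical list of (x, y) positions in micrometres.
    """
    if len(track) < 2:
        raise ValueError("need at least two positions")
    dt = 1.0 / fps  # time between consecutive frames, in seconds
    # Squared displacement between each pair of consecutive frames
    squared_steps = [
        (x1 - x0) ** 2 + (y1 - y0) ** 2
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]
    msd = sum(squared_steps) / len(squared_steps)
    return msd / (4.0 * dt)
```

By the Stokes-Einstein relation, D scales inversely with hydrodynamic radius, so larger scattering objects give a smaller estimate; that is the direction of change the abstract reports for the target mixture.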
Citations: 0
A multi-scale analysis of the CzrA transcription repressor highlights the allosteric changes induced by metal ion binding
arXiv - QuanBio - Biomolecules Pub Date : 2024-09-05 DOI: arxiv-2409.03584
Marta Rigoli, Raffaello Potestio, Roberto Menichetti
Abstract: Allosteric regulation is a widespread strategy employed by several proteins to transduce chemical signals and perform biological functions. Metal sensor proteins are exemplary in this respect, e.g., in that they selectively bind and unbind DNA depending on the state of a distal ion coordination site. In this work, we carry out an investigation of the structural and mechanical properties of the CzrA transcription repressor through the analysis of microsecond-long molecular dynamics (MD) trajectories; the latter are processed through the mapping entropy optimisation workflow (MEOW), a recently developed information-theoretical method that highlights, in an unsupervised manner, residues of particular mechanical, functional, and biological importance. This approach allows us to unveil how differences in the properties of the molecule are controlled by the state of the zinc coordination site, with particular attention to the DNA binding region. These changes correlate with a redistribution of the conformational variability of the residues throughout the molecule, in spite of an overall consistency of its architecture in the two (ion-bound and free) coordination states. The results of this work corroborate previous studies, provide novel insight into the fine details of the mechanics of CzrA, and showcase the MEOW approach as a novel instrument for the study of allosteric regulation and other processes in proteins through the analysis of plain MD simulations.
Citations: 0
Multiview Random Vector Functional Link Network for Predicting DNA-Binding Proteins
arXiv - QuanBio - Biomolecules Pub Date : 2024-09-04 DOI: arxiv-2409.02588
A. Quadir, M. Sajid, M. Tanveer
Abstract: The identification of DNA-binding proteins (DBPs) is a critical task due to their significant impact on various biological activities. Understanding the mechanisms underlying protein-DNA interactions is essential for elucidating various life activities. In recent years, machine learning-based models have been prominently utilized for DBP prediction. In this paper, to predict DBPs, we propose a novel framework termed a multiview random vector functional link (MvRVFL) network, which fuses neural network architecture with multiview learning. The proposed MvRVFL model combines the benefits of late and early fusion, allowing for distinct regularization parameters across different views while leveraging a closed-form solution to determine unknown parameters efficiently. The primal objective function incorporates a coupling term aimed at minimizing a composite of errors stemming from all views. From each of the three protein views of the DBP datasets, we extract five features. These features are then fused together by incorporating a hidden feature during the model training process. The performance of the proposed MvRVFL model on the DBP dataset surpasses that of baseline models, demonstrating its superior effectiveness. Furthermore, we extend our assessment to the UCI, KEEL, AwA, and Corel5k datasets, to establish the practicality of the proposed models. The consistency error bound, the generalization error bound, and empirical findings, coupled with rigorous statistical analyses, confirm the superior generalization capabilities of the MvRVFL model compared to the baseline models.
Citations: 0
Computational Methods to Investigate Intrinsically Disordered Proteins and their Complexes
arXiv - QuanBio - Biomolecules Pub Date : 2024-09-03 DOI: arxiv-2409.02240
Zi Hao Liu, Maria Tsanai, Oufan Zhang, Julie Forman-Kay, Teresa Head-Gordon
Abstract: In 1999 Wright and Dyson highlighted the fact that large sections of the proteome of all organisms are comprised of protein sequences that lack globular folded structures under physiological conditions. Since then the biophysics community has made significant strides in unraveling the intricate structural and dynamic characteristics of intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs). Unlike crystallographic beamlines and their role in streamlining acquisition of structures for folded proteins, an integrated experimental and computational approach aimed at IDPs/IDRs has emerged. In this Perspective we aim to provide a robust overview of current computational tools for IDPs and IDRs, and most recently their complexes and phase separated states, including statistical models, physics-based approaches, and machine learning methods that permit structural ensemble generation and validation against many solution experimental data types.
Citations: 0
Beyond Efficiency: Molecular Data Pruning for Enhanced Generalization
arXiv - QuanBio - Biomolecules Pub Date : 2024-09-02 DOI: arxiv-2409.01081
Dingshuo Chen, Zhixun Li, Yuyan Ni, Guibin Zhang, Ding Wang, Qiang Liu, Shu Wu, Jeffrey Xu Yu, Liang Wang
Abstract: With the emergence of various molecular tasks and massive datasets, how to perform efficient training has become an urgent yet under-explored issue in the area. Data pruning (DP), as an oft-stated approach to saving training burdens, filters out less influential samples to form a coreset for training. However, the increasing reliance on pretrained models for molecular tasks renders traditional in-domain DP methods incompatible. Therefore, we propose a Molecular data Pruning framework for enhanced Generalization (MolPeg), which focuses on the source-free data pruning scenario, where data pruning is applied with pretrained models. By maintaining two models with different updating paces during training, we introduce a novel scoring function to measure the informativeness of samples based on the loss discrepancy. As a plug-and-play framework, MolPeg realizes the perception of both source and target domain and consistently outperforms existing DP methods across four downstream tasks. Remarkably, it can surpass the performance obtained from full-dataset training, even when pruning up to 60-70% of the data on the HIV and PCBA datasets. Our work suggests that the discovery of effective data-pruning metrics could provide a viable path to both enhanced efficiency and superior generalization in transfer learning.
Citations: 0
Leveraging Deep Generative Model For Computational Protein Design And Optimization
arXiv - QuanBio - Biomolecules Pub Date : 2024-08-30 DOI: arxiv-2408.17241
Boqiao Lai
Abstract: Proteins are the fundamental macromolecules that play diverse and crucial roles in all living matter and have tremendous implications in healthcare, manufacturing, and biotechnology. Their functions are largely determined by the sequences of amino acids that compose them and their unique three-dimensional structures when folded. The recent surge in highly accurate computational protein structure prediction tools has equipped scientists with the means to derive preliminary structural insights without the onerous costs of experimental structure determination. These breakthroughs hold profound promise for building robust and efficient in silico protein design systems. While the prospect of designing de novo proteins with precise computational accuracy remains a grand challenge in biochemical engineering, conventional assembly-based and rational design methods often grapple with the expansive design space, resulting in suboptimal design success rates. Although recently emerged deep learning-based models have shown promise in improving the efficiency of the computational protein design process, a significant gap persists between current design paradigms and their experimental realization. This thesis will investigate the potential of deep generative models in refining protein structure and sequence design methods, aiming to develop frameworks capable of crafting novel protein sequences with predetermined structures or specific functionalities. By harnessing extensive protein databases and cutting-edge neural architectures, this research aims to enhance precision and robustness in current protein design paradigms, potentially paving the way for advancements across various scientific fields.
Citations: 0
Quantitative Prediction of Protein-Polyelectrolyte Binding Thermodynamics: Adsorption of Heparin-Analog Polysulfates to the SARS-CoV-2 Spike Protein RBD
arXiv - QuanBio - Biomolecules Pub Date : 2024-08-30 DOI: arxiv-2409.00210
Lenard Neander, Cedric Hannemann, Roland R. Netz, Anil Kumar Sahoo
Abstract: Interactions of polyelectrolytes (PEs) with proteins play a crucial role in numerous biological processes, such as the internalization of virus particles into host cells. Although docking, machine learning methods, and molecular dynamics (MD) simulations are utilized to estimate binding poses and binding free energies of small-molecule drugs to proteins, quantitative prediction of the binding thermodynamics of PE-based drugs presents a significant obstacle in computer-aided drug design. This is due to the sluggish dynamics of PEs caused by their size and strong charge-charge correlations. In this paper, we introduce advanced sampling methods based on a force-spectroscopy setup and theoretical modeling to overcome this barrier. We exemplify our method with explicit solvent all-atom MD simulations of interactions of anionic PEs that show antiviral properties, namely heparin and linear polyglycerol sulfate (LPGS), with the SARS-CoV-2 spike protein receptor binding domain (RBD). Our prediction for the binding free energy of LPGS to the wild-type RBD matches experimentally measured dissociation constants within thermal energy, kT, and correctly reproduces the experimental PE-length dependence. We find that LPGS binds to the Delta-variant RBD with an additional free-energy gain of 2.4 kT, compared to the wild-type RBD, in accord with electrostatic arguments. We show that the LPGS-RBD binding is solvent-dominated and enthalpy-driven, though with a large entropy-enthalpy compensation. Our method is applicable to general polymer adsorption phenomena and predicts precise binding free energies and re-configurational friction as needed for drug and drug-delivery design.
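The abstract compares predicted binding free energies with measured dissociation constants. A minimal sketch of that conversion follows, using the textbook relation ΔG = kT·ln(K_d / c0). The function names, the 1 M standard concentration c0, and the sign convention (more negative ΔG means tighter binding) are assumptions for illustration and may differ from the conventions used in the paper.

```python
import math


def binding_free_energy_kT(kd_molar, c0_molar=1.0):
    """Binding free energy in units of kT from a dissociation constant.

    Assumes dG = kT * ln(Kd / c0) with a 1 M standard state by default.
    """
    return math.log(kd_molar / c0_molar)


def kd_fold_change(delta_delta_g_kT):
    """Ratio Kd(variant) / Kd(reference) implied by a free-energy shift in kT."""
    return math.exp(delta_delta_g_kT)


# An additional free-energy gain of 2.4 kT corresponds to a shift of -2.4 kT
# under this sign convention, i.e. roughly an 11-fold lower Kd.
fold = kd_fold_change(-2.4)
```

Under these assumptions, the 2.4 kT gain quoted for the Delta-variant RBD would correspond to a dissociation constant about eleven times smaller than for the wild type.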
Citations: 0
Technical Report of HelixFold3 for Biomolecular Structure Prediction
arXiv - QuanBio - Biomolecules Pub Date : 2024-08-30 DOI: arxiv-2408.16975
Lihang Liu, Shanzhuo Zhang, Yang Xue, Xianbin Ye, Kunrui Zhu, Yuxin Li, Yang Liu, Xiaonan Zhang, Xiaomin Fang
Abstract: The AlphaFold series has transformed protein structure prediction with remarkable accuracy, often matching experimental methods. AlphaFold2, AlphaFold-Multimer, and the latest AlphaFold3 represent significant strides in predicting single protein chains, protein complexes, and biomolecular structures. While AlphaFold2 and AlphaFold-Multimer are open-sourced, facilitating rapid and reliable predictions, AlphaFold3 remains partially accessible through a limited online server and has not been open-sourced, restricting further development. To address these challenges, the PaddleHelix team is developing HelixFold3, aiming to replicate AlphaFold3's capabilities. Using insights from previous models and extensive datasets, HelixFold3 achieves an accuracy comparable to AlphaFold3 in predicting the structures of conventional ligands, nucleic acids, and proteins. The initial release of HelixFold3 is available as open source on GitHub for academic research, promising to advance biomolecular research and accelerate discoveries. We also provide online service at PaddleHelix website at https://paddlehelix.baidu.com/app/all/helixfold3/forecast.
Citations: 0
Large-Scale Multi-omic Biosequence Transformers for Modeling Peptide-Nucleotide Interactions
arXiv - QuanBio - Biomolecules Pub Date : 2024-08-29 DOI: arxiv-2408.16245
Sully F. Chen, Robert J. Steele, Beakal Lemeneh, Shivanand P. Lad, Eric Oermann
Abstract: The transformer architecture has revolutionized bioinformatics and driven progress in the understanding and prediction of the properties of biomolecules. Almost all research on large-scale biosequence transformers has focused on one domain at a time (single-omic), usually nucleotides or peptides. These models have seen incredible success in downstream tasks in each domain and have achieved particularly noteworthy breakthroughs in sequences of peptides and structural modeling. However, these single-omic models are naturally incapable of modeling multi-omic tasks, one of the most biologically critical being nucleotide-peptide interactions. We present our work training the first multi-omic nucleotide-peptide foundation models. We show that these multi-omic models (MOMs) can learn joint representations between various single-omic distributions that are emergently consistent with the Central Dogma of molecular biology, despite only being trained on unlabeled biosequences. We further demonstrate that MOMs can be fine-tuned to achieve state-of-the-art results on peptide-nucleotide interaction tasks, namely predicting the change in Gibbs free energy (ΔG) of the binding interaction between a given oligonucleotide and peptide, as well as the effect on this binding interaction due to mutations in the oligonucleotide sequence (ΔΔG). Remarkably, we show that multi-omic biosequence transformers emergently learn useful structural information without any prior structural training, allowing us to predict which peptide residues are most involved in the peptide-nucleotide binding interaction. Lastly, we provide evidence that multi-omic biosequence models are non-inferior to foundation models trained on single-omics distributions, suggesting a more generalized or foundational approach to building these models.
Citations: 0
S-MolSearch: 3D Semi-supervised Contrastive Learning for Bioactive Molecule Search
arXiv - QuanBio - Biomolecules Pub Date : 2024-08-27 DOI: arxiv-2409.07462
Gengmo Zhou, Zhen Wang, Feng Yu, Guolin Ke, Zhewei Wei, Zhifeng Gao
Abstract: Virtual Screening is an essential technique in the early phases of drug discovery, aimed at identifying promising drug candidates from vast molecular libraries. Recently, ligand-based virtual screening has garnered significant attention due to its efficacy in conducting extensive database screenings without relying on specific protein-binding site information. Obtaining binding affinity data for complexes is highly expensive, resulting in a limited amount of available data that covers a relatively small chemical space. Moreover, these datasets contain a significant amount of inconsistent noise. It is challenging to identify an inductive bias that consistently maintains the integrity of molecular activity during data augmentation. To tackle these challenges, we propose S-MolSearch, the first framework to our knowledge, that leverages molecular 3D information and affinity information in semi-supervised contrastive learning for ligand-based virtual screening. Drawing on the principles of inverse optimal transport, S-MolSearch efficiently processes both labeled and unlabeled data, training molecular structural encoders while generating soft labels for the unlabeled data. This design allows S-MolSearch to adaptively utilize unlabeled data within the learning process. Empirically, S-MolSearch demonstrates superior performance on widely-used benchmarks LIT-PCBA and DUD-E. It surpasses both structure-based and ligand-based virtual screening methods for enrichment factors across 0.5%, 1% and 5%.
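The enrichment factors quoted at 0.5%, 1% and 5% can be illustrated with the usual definition: the hit rate among the top-ranked fraction of a screened library divided by the hit rate in the whole library. The sketch below is one such implementation; the function name, the input format, and the choice to round the cutoff up with `math.ceil` are assumptions, since benchmark implementations differ in how they handle the cutoff and ties.

```python
import math


def enrichment_factor(scores, labels, fraction):
    """Enrichment factor at a given top fraction (e.g. 0.01 for 1%).

    `scores`: predicted scores, higher means more likely active.
    `labels`: 1 for active, 0 for inactive, aligned with `scores`.
    The cutoff size is rounded up (an assumption; conventions vary).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and aligned")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("no actives in the library")
    n_top = max(1, math.ceil(fraction * len(scores)))
    # Rank molecules from highest to lowest score
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])
    hit_rate_top = hits_in_top / n_top
    hit_rate_all = total_actives / len(scores)
    return hit_rate_top / hit_rate_all
```

A value of 1 means the ranking does no better than random selection; larger values mean actives are concentrated near the top of the ranked list.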
Citations: 0