Repurformer: Transformers for Repurposing-Aware Molecule Generation
Changhun Lee, Gyumin Lee
arXiv - QuanBio - Biomolecules, arXiv:2407.11439, published 2024-07-16
Abstract
Generating molecules that are as diverse as possible while retaining desired properties is
crucial for drug discovery research and has motivated many approaches based on
deep generative models. Despite recent advancements in these models,
particularly in variational autoencoders (VAEs), generative adversarial
networks (GANs), Transformers, and diffusion models, a significant challenge
known as the sample bias problem remains. This problem occurs when
generated molecules targeting the same protein tend to be structurally similar,
reducing the diversity of generation. To address this, we propose leveraging
multi-hop relationships among proteins and compounds. Our model, Repurformer,
integrates bi-directional pretraining with Fast Fourier Transform (FFT) and
low-pass filtering (LPF) to capture complex interactions and generate diverse
molecules. A series of experiments on the BindingDB dataset confirms that
Repurformer successfully creates substitutes for anchor compounds that resemble
positive compounds, increasing diversity between the anchor and generated
compounds.
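
The abstract names FFT and low-pass filtering but does not say how they are applied inside Repurformer. As a generic illustration only, and not the authors' implementation, the sketch below shows what low-pass filtering in the frequency domain means for a one-dimensional real-valued sequence: transform it, zero the high-frequency bins, and transform back. The names `fft`, `low_pass`, and `cutoff` are hypothetical, and the power-of-two length restriction is an assumption of this simple radix-2 version.

```python
import cmath
from typing import List, Sequence


def fft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized).

    Assumption of this sketch: len(x) is a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor combines the two half-size transforms.
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def low_pass(signal: Sequence[float], cutoff: int) -> List[float]:
    """Zero every frequency bin farther than `cutoff` from the DC bin."""
    n = len(signal)
    spectrum = fft(signal)
    # For real input, bins k and n - k form a conjugate pair, so both are
    # kept or dropped together to keep the reconstructed signal real.
    filtered = [c if min(k, n - k) <= cutoff else 0j
                for k, c in enumerate(spectrum)]
    restored = fft(filtered, inverse=True)
    # The inverse transform above is unnormalized, so divide by n here.
    return [v.real / n for v in restored]


if __name__ == "__main__":
    # A constant level plus a fast alternating component.
    noisy = [1.0 + 0.5 * (-1) ** i for i in range(8)]
    print(low_pass(noisy, cutoff=1))
```

In this toy input the alternating component sits entirely in the highest-frequency bin, so a cutoff of 1 removes it and leaves the constant level. How the paper chooses the filtered representation and the cutoff is not stated in the abstract.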