Advances in information retrieval : ... European Conference on IR Research, ECIR ... proceedings. European Conference on IR Research: Latest Articles

Utilizing Low-Dimensional Molecular Embeddings for Rapid Chemical Similarity Search.
Kathryn E Kirchoff, James Wellnitz, Joshua E Hochuli, Travis Maxfield, Konstantin I Popov, Shawn Gomez, Alexander Tropsha
Abstract: Nearest neighbor-based similarity searching is a common task in chemistry, with notable use cases in drug discovery. Yet, some of the most commonly used approaches for this task still leverage a brute-force approach. In practice this can be computationally costly and overly time-consuming, due in part to the sheer size of modern chemical databases. Previous computational advancements for this task have generally relied on improvements to hardware or dataset-specific tricks that lack generalizability. Approaches that leverage lower-complexity searching algorithms remain relatively underexplored. However, many of these algorithms are approximate solutions and/or struggle with typical high-dimensional chemical embeddings. Here we evaluate whether a combination of low-dimensional chemical embeddings and a k-d tree data structure can achieve fast nearest neighbor queries while maintaining performance on standard chemical similarity search benchmarks. We examine different dimensionality reductions of standard chemical embeddings as well as a learned, structurally aware embedding, SmallSA, for this task. With this framework, searches on over one billion chemicals execute in less than a second on a single CPU core, five orders of magnitude faster than the brute-force approach. We also demonstrate that SmallSA achieves competitive performance on chemical similarity benchmarks.

DOI: https://doi.org/10.1007/978-3-031-56060-6_3
Published: 2024-03-01
Publication type: Journal Article
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10998712/pdf/
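The abstract's core idea, exact nearest neighbor queries over low-dimensional embeddings using a k-d tree, can be illustrated with a minimal sketch. This is not the authors' implementation: the random vectors merely stand in for molecular embeddings, and the dimension (8), dataset size (2000), and all function names are assumptions chosen for illustration. The sketch builds a k-d tree, answers a k-nearest-neighbor query with branch pruning, and compares the result against a brute-force scan.

```python
import heapq
import random


def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested tuples (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the coordinate axes
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2  # the median point becomes the splitting node
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(tree, query, k):
    """Return the k nearest points as (squared_distance, point), closest first."""
    heap = []  # max-heap via negated distances; holds at most k candidates

    def visit(node):
        if node is None:
            return
        point, axis, left, right = node
        d = sq_dist(point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, point))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, point))
        diff = query[axis] - point[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near)
        # Cross the splitting plane only if a closer point could lie beyond it.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(tree)
    return sorted((-neg_d, p) for neg_d, p in heap)


def brute_force(points, query, k):
    """Reference answer: scan every point and keep the k closest."""
    return sorted((sq_dist(p, query), p) for p in points)[:k]


# Hypothetical stand-in data: random 8-dimensional "embeddings".
random.seed(0)
DIM = 8
embeddings = [tuple(random.random() for _ in range(DIM)) for _ in range(2000)]
tree = build_kdtree(embeddings)
query = tuple(random.random() for _ in range(DIM))
neighbors = knn(tree, query, k=5)
```

The pruning step is what makes the tree pay off in low dimensions: whole subtrees are skipped when the splitting plane is farther away than the current k-th best candidate. As the dimension grows, fewer subtrees can be skipped, which is consistent with the abstract's remark that such algorithms struggle with typical high-dimensional chemical embeddings.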