Algorithm for Efficient Superposition and Clustering of Molecular Assemblies Using the Branch-and-Bound Method
Author: Yuki Yamamoto
Journal: Journal of Chemical Information and Modeling
Published: 2025-04-25
DOI: 10.1021/acs.jcim.4c02217 (https://doi.org/10.1021/acs.jcim.4c02217)
The root-mean-square deviation (RMSD) is one of the most common metrics for comparing three-dimensional chemical structures. Structural similarity plays an important role in data chemistry because it is closely related to chemical reactivity, physical properties, and bioactivity. Despite the wide applicability of the RMSD, determining the atom mapping and the spatial superposition simultaneously remains difficult to solve in polynomial time. We introduce an algorithm called mobbRMSD, which is formulated in molecular-oriented coordinates and uses the branch-and-bound method to obtain an exact solution for the RMSD. mobbRMSD efficiently handles a wide range of chemical systems, such as molecular liquids, solute solvation, and self-assemblies of large molecules, by exploiting chemical knowledge such as atom types, chemical bonding, and chirality. In benchmarks on small molecular aggregates, mobbRMSD nearly doubles the largest system size that existing exact methods can handle. Furthermore, mobbRMSD was able to analyze the structural similarity of large molecular micelles, which has been difficult with previous methods. We also propose a mobbRMSD-based structural clustering method designed for molecular dynamics trajectories, in which the average computational cost of the branch-and-bound search asymptotically approaches polynomial time as the number of data points increases. Our algorithm is freely available at https://github.com/yymmt742/mobbrmsd.
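To make the combined mapping-and-superposition problem concrete, the following is a minimal sketch of the naive exhaustive approach, not the mobbRMSD algorithm and not its API. It assumes a toy system of identical molecules in two dimensions, where the optimal rotation for a fixed atom order has a simple closed form, and it tries every molecule-to-molecule mapping. All function names and coordinates here are hypothetical and chosen only for illustration.

```python
import itertools
import math


def centered(points):
    """Translate 2D points so that their centroid is at the origin."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]


def superposed_rmsd_2d(ref, mob):
    """Minimum RMSD over translations and proper rotations, fixed atom order.

    2D only (an assumption of this sketch): with a = sum(p . q) and
    b = sum(p x q), the best rotation leaves a residual of
    |p|^2 + |q|^2 - 2 * hypot(a, b). Reflections are never considered.
    """
    p, q = centered(ref), centered(mob)
    norm = sum(x * x + y * y for x, y in p) + sum(x * x + y * y for x, y in q)
    a = sum(px * qx + py * qy for (px, py), (qx, qy) in zip(p, q))
    b = sum(py * qx - px * qy for (px, py), (qx, qy) in zip(p, q))
    msd = (norm - 2.0 * math.hypot(a, b)) / len(p)
    return math.sqrt(max(msd, 0.0))  # clamp tiny negative rounding errors


def brute_force_rmsd(ref_mols, mob_mols):
    """Try all M! molecule mappings; return (best RMSD, best permutation)."""
    ref = [atom for mol in ref_mols for atom in mol]
    best = (math.inf, None)
    for perm in itertools.permutations(range(len(mob_mols))):
        mob = [atom for i in perm for atom in mob_mols[i]]
        best = min(best, (superposed_rmsd_2d(ref, mob), perm))
    return best


# Hypothetical toy data: three diatomic "molecules" in the plane.
ref_mols = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(3.0, 0.0), (3.0, 1.0)],
    [(0.0, 4.0), (1.0, 5.0)],
]


def move(point):
    """Rotate by 90 degrees about the origin, then translate."""
    x, y = point
    return (-y + 2.0, x - 1.0)


# Same structure, rigidly moved, with the molecule labels shuffled.
moved = [[move(atom) for atom in mol] for mol in ref_mols]
mob_mols = [moved[2], moved[0], moved[1]]

rmsd, perm = brute_force_rmsd(ref_mols, mob_mols)
print(rmsd, perm)
```

In this toy case the search recovers the shuffled labelling and an RMSD that is zero up to rounding error. The number of candidate mappings grows as M! with the number of molecules M, which is why exhaustive enumeration stops being practical for anything beyond very small aggregates. According to the abstract, mobbRMSD replaces this enumeration with a branch-and-bound search; the bounding scheme itself is not described there and is not reproduced in this sketch.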
About the journal:
The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery.
Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field.
As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.