Louise S. Price, Matteo Paloni, Matteo Salvalaglio and Sarah L. Price*
Crystal Growth & Design, 2025, 25 (9), 3186–3209. Published 2025-04-28. DOI: 10.1021/acs.cgd.5c00255. https://pubs.acs.org/doi/10.1021/acs.cgd.5c00255
One Size Fits All? Development of the CPOSS209 Data Set of Experimental and Hypothetical Polymorphs for Testing Computational Modeling Methods
Organic crystal structure prediction (CSP) studies have led to the rapid development of methods for predicting the relative energies of known and computer-generated crystal structures. There is a compromise between the level of theoretical treatment, its reliability across different types of organic systems, how its accuracy depends on the size and shape of the unit cell, and the size and the number of structures that can be modeled at an affordable computational cost. We have used our database of crystal structure prediction studies, often performed as a complement to experimental screening, to produce sets comprising 6 to 15 crystal structures, covering known polymorphs, observed packings of closely related molecules, and CSP-generated energetically competitive but distinct structures, for 20 organic molecules. These have been chosen to illustrate some of the issues that need consideration in any lattice energy method that seeks to be generally applicable to moderate-sized organic molecules, including small drug molecules. We included the methods of crystallization reported for the experimental polymorphs. In all of the examples, the original CSP used electronic structure calculations on the molecule to give the conformational energy and an anisotropic atom–atom model for the electrostatic intermolecular energy, combined with an empirical “exp-6” repulsion-dispersion model to give the intermolecular lattice energy. The lattice energies and structures are compared with those obtained by reoptimizing with periodic, plane-wave, dispersion-corrected density functional theory, specifically PBE with the TS dispersion correction, and with single-point energies where the many-body dispersion (MBD) correction is applied, as an example of a widely used “workhorse” method. The use of this data set for a preliminary test of modeling methods is illustrated for two machine-learned foundation models, MACE-MP-0 and MACE-OFF23.
The challenges in modeling the putative and observed polymorphs for a range of molecules, their energies, and the possible level of agreement with experimental data are illustrated. Very similar molecules can differ significantly in the polymorphs observed, only partially reflecting the range of polymorph screening experiments used and the energetically competitive structures produced by CSP approaches based on a purely thermodynamic paradigm.
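The decomposition of the lattice energy described in the abstract can be sketched in a few lines of Python. This is only an illustration of the general "exp-6" (Buckingham) functional form, not the authors' implementation: the function names, the parameters A, B and C, and the idea of passing in a precomputed electrostatic term and conformational energy penalty are all assumptions made here for clarity, and the contact list is assumed to already contain each intermolecular atom-atom contact with the correct weighting per molecule.

```python
import math


def exp6_pair_energy(r, A, B, C):
    """Buckingham 'exp-6' energy for one atom-atom contact at separation r.

    A * exp(-B * r) models short-range repulsion and -C / r**6 models
    dispersion. A, B and C are hypothetical empirical parameters.
    """
    return A * math.exp(-B * r) - C / r**6


def lattice_energy(contacts, e_electrostatic, delta_e_intra):
    """Sum the three contributions named in the abstract (illustrative only).

    contacts        : iterable of (r, A, B, C) tuples, one per intermolecular
                      atom-atom contact (assumed already weighted per molecule)
    e_electrostatic : intermolecular electrostatic energy, assumed to come
                      from a separate anisotropic atom-atom model
    delta_e_intra   : conformational energy of the molecule relative to its
                      gas-phase minimum, assumed to come from an electronic
                      structure calculation
    """
    e_rep_disp = sum(exp6_pair_energy(r, A, B, C) for r, A, B, C in contacts)
    return e_rep_disp + e_electrostatic + delta_e_intra
```

All three terms must be in the same energy unit per molecule (for example kJ/mol) for the sum to be meaningful.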
Derivation and illustrative use of the CPOSS209 dataset of 209 experimental and hypothetical crystal structures of 20 small pharmaceutical molecules and precursors, as optimized by two well-established lattice energy models used in CSP, discussed in the context of current experimental knowledge.
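A preliminary test of a modeling method against a set of structures for one molecule might, as a rough sketch, compare relative lattice energies and the stability rank of an observed polymorph between two methods. The helper names and the energy values below are invented placeholders for illustration; they are not taken from the CPOSS209 data set and do not reflect its file format.

```python
def relative_energies(energies):
    """Return energies relative to the lowest-energy structure of one method."""
    e_min = min(energies.values())
    return {name: e - e_min for name, e in energies.items()}


def rank_of(structure, energies):
    """1-based stability rank of a structure (1 = lowest energy)."""
    ordered = sorted(energies, key=energies.get)
    return ordered.index(structure) + 1


def mean_abs_deviation(method_a, method_b):
    """Mean absolute difference between the relative energies of two methods.

    Each method is referenced to its own lowest-energy structure, so the
    result reflects differences in relative, not absolute, energies.
    """
    rel_a = relative_energies(method_a)
    rel_b = relative_energies(method_b)
    return sum(abs(rel_a[k] - rel_b[k]) for k in rel_a) / len(rel_a)


# Placeholder lattice energies (arbitrary units) for one hypothetical molecule.
method_a = {"form_I": -100.0, "form_II": -98.0, "csp_1": -99.0}
method_b = {"form_I": -120.0, "form_II": -119.0, "csp_1": -116.0}

print(rank_of("form_II", method_a), rank_of("form_II", method_b))
print(mean_abs_deviation(method_a, method_b))
```

Comparing ranks as well as energy differences is a deliberate choice in this sketch: two methods can agree closely on average while still disagreeing on which structure is most stable.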
About the journal:
The aim of Crystal Growth & Design is to stimulate cross-fertilization of knowledge among scientists and engineers working in the fields of crystal growth, crystal engineering, and the industrial application of crystalline materials.
Crystal Growth & Design publishes theoretical and experimental studies of the physical, chemical, and biological phenomena and processes related to the design, growth, and application of crystalline materials. Synergistic approaches originating from different disciplines and technologies and integrating the fields of crystal growth, crystal engineering, intermolecular interactions, and industrial application are encouraged.