Distance Profiles (DiP): A translationally and rotationally invariant 3D structure descriptor capturing steric properties of molecules
K. Baumann
Quantitative Structure-Activity Relationships, published 2002-11-01
DOI: 10.1002/1521-3838(200211)21:5<507::AID-QSAR507>3.0.CO;2-L
A novel translationally and rotationally invariant structure descriptor based on the distribution of 3D atom pairs is described. The new Distance Profiles (DiP) descriptor was applied to two data sets that had previously been studied with various 3D-QSAR techniques. DiP compares favorably with the other descriptors on these two data sets and yields better models in both cases. Since DiP is used in combination with variable selection to achieve interpretability, special emphasis was placed on validating the derived models. Overfitting was avoided by constraining the maximum number of variables that may be selected, and by using leave-50%-out cross-validation instead of leave-one-out cross-validation as the objective function in variable selection. Furthermore, the derived models were validated with a permutation test in which the entire variable selection procedure is repeated each time the response data are scrambled.
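The abstract does not say how the atom-pair distribution is binned or whether pairs are separated by atom type, so the following is only a minimal sketch of the general idea, not the DiP definition from the paper. It assumes a plain histogram over all interatomic distances; the helper names `distance_profile` and `random_rotation`, the toy coordinates, and the bin edges are all hypothetical. It illustrates why any descriptor built from pairwise distances is unchanged when the molecule is rotated or translated.

```python
import numpy as np


def distance_profile(coords, bin_edges):
    """Histogram of all pairwise interatomic distances.

    Hypothetical helper: a generic distance histogram, not the
    published DiP descriptor.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise difference vectors, shape (n_atoms, n_atoms, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Count each unordered atom pair once (upper triangle, no diagonal).
    upper = np.triu_indices(len(coords), k=1)
    counts, _ = np.histogram(dist[upper], bins=bin_edges)
    return counts


def random_rotation(rng):
    """Random proper rotation matrix from the QR factor of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        # Flip one column so the determinant is +1 (rotation, not reflection).
        q[:, 0] = -q[:, 0]
    return q


# Toy 5-atom "molecule" (made-up coordinates, arbitrary length units).
coords = np.array([
    [0.0, 0.0, 0.0],
    [1.2, 0.0, 0.0],
    [0.0, 2.3, 0.0],
    [0.0, 0.0, 3.4],
    [1.1, 1.3, 0.7],
])
bin_edges = np.arange(0.0, 7.0, 1.0)  # assumed bin width of 1.0

rng = np.random.default_rng(0)
rotation = random_rotation(rng)
shift = np.array([5.0, -3.0, 2.0])

# Apply the rigid-body motion to every atom.
moved = coords @ rotation.T + shift

profile = distance_profile(coords, bin_edges)
profile_moved = distance_profile(moved, bin_edges)

print(profile)
print(np.array_equal(profile, profile_moved))
```

Because rotation and translation preserve every interatomic distance, the two profiles agree without any prior alignment of the structures. The toy distances were chosen to lie well away from the bin edges; with real data, a distance sitting on an edge could change bins through floating-point rounding.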