A. Bassoli, M. Drew, Channa K. Hattotuwagama, L. Merlini, G. Morini, Gareth R. H. Wilden
Quantitative Structure-Activity Relationships of Sweet Isovanillyl Derivatives
Journal: Quantitative Structure-Activity Relationships (Journal Article), published 2001-05-01
DOI: 10.1002/1521-3838(200105)20:1<3::AID-QSAR3>3.0.CO;2-H (https://doi.org/10.1002/1521-3838(200105)20:1<3::AID-QSAR3>3.0.CO;2-H)
Citations: 16
Abstract
Isovanillyl derivatives constitute a large class of sweet compounds that combine a high degree of structural similarity with a wide range of biological activity: the relative sweetness (RS) spans from 50 to 10 000 times that of sucrose. This paper describes the results obtained by applying statistical models to develop QSARs for these derivatives. For a set of 14 compounds (set 1), appropriate physicochemical parameters for the regression equations were selected using the genetic algorithm method. The best equation indicates a very close correlation (N=14, ND=5, r²=0.982, r²cv=0.942, LOF=0.074, PRESS=0.271, SPRESS=0.184, SDEP=0.139). Good results were also obtained by Molecular Field Analysis (MFA) applied to the same set of compounds (N=14, ND=4, r²=0.957, r²cv=0.925, LOF=0.044, PRESS=0.348, SPRESS=0.196, SDEP=0.158). QSARs were also derived for a larger set of 41 compounds (set 2, comprising set 1 plus 27 other compounds) with a much greater variety of structural types. These compounds were divided into a training set of 35 compounds and a test set of 6 compounds. The most significant QSAR obtained using physicochemical parameters (N=35, ND=6, r²=0.673, r²cv=0.522, LOF=0.337, PRESS=7.432, SPRESS=0.515, SDEP=0.461) proved less successful than the one using MFA parameters (N=35, ND=6, r²=0.746, r²cv=0.607, LOF=0.261, PRESS=6.110, SPRESS=0.467, SDEP=0.418). PRESS values for the test set were 4.079 (physicochemical) and 1.962 (MFA), showing that the MFA data had more predictive power. Equations with different numbers of descriptors were compared, and it was concluded that the LOF, which depends on the number of parameters used as well as on the sum of squares, is a suitable measure of equation quality. The equations were also validated by scrambling the experimental data, which gave significantly worse agreement than the real data except when an excessive number of descriptors was used.
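The abstract reports PRESS, SPRESS and SDEP for each model but does not state how they are related. The short Python sketch below assumes the definitions commonly used in QSAR work, SDEP = sqrt(PRESS / N) and SPRESS = sqrt(PRESS / (N - ND - 1)), and checks them against the four training-set models quoted above. The function names, the `MODELS` table layout and the tolerance are illustrative choices, not taken from the paper.

```python
import math

def sdep(press: float, n: int) -> float:
    """Assumed definition: standard deviation of error of prediction."""
    return math.sqrt(press / n)

def spress(press: float, n: int, nd: int) -> float:
    """Assumed definition: PRESS-based standard error, corrected for
    the ND descriptors plus the intercept."""
    return math.sqrt(press / (n - nd - 1))

# (label, N, ND, PRESS, reported SPRESS, reported SDEP), copied from the abstract.
MODELS = [
    ("set 1, physicochemical", 14, 5, 0.271, 0.184, 0.139),
    ("set 1, MFA",             14, 4, 0.348, 0.196, 0.158),
    ("set 2, physicochemical", 35, 6, 7.432, 0.515, 0.461),
    ("set 2, MFA",             35, 6, 6.110, 0.467, 0.418),
]

def check(tol: float = 1e-3):
    """Recompute SPRESS and SDEP from PRESS and compare with the reported
    values. The tolerance allows for rounding of PRESS to three decimals."""
    rows = []
    for label, n, nd, press, rep_spress, rep_sdep in MODELS:
        s = spress(press, n, nd)
        d = sdep(press, n)
        ok = abs(s - rep_spress) < tol and abs(d - rep_sdep) < tol
        rows.append((label, s, d, ok))
    return rows

if __name__ == "__main__":
    for label, s, d, ok in check():
        print(f"{label:25s} SPRESS={s:.4f} SDEP={d:.4f} consistent={ok}")
```

Under these assumed definitions, all four sets of reported values are reproduced to within rounding. The comparison also shows why SPRESS is always larger than SDEP for the same model: its denominator shrinks as descriptors are added, so it penalises larger equations in a way SDEP does not.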