Authors: F. Adilova, Alisher Ikramov
Journal: Advanced techniques in biology & medicine
Published: 2017-04-07
DOI: https://doi.org/10.4172/2379-1764.1000216
Data Set Analysis for the Calculation of the QSAR Models Predictive Efficiency Based on Activity Cliffs
The activity cliff concept is highly relevant to medicinal chemistry. Here we explore the concept of “data set modelability”, i.e., an a priori estimate of the feasibility of obtaining externally predictive QSAR models for a data set of bioactive compounds. This concept emerged from analyzing the effect of so-called “activity cliffs” on the overall performance of QSAR models. Several indexes of “modelability” (SALI, ISAC, and MODI) are already known. We extended MODI to data sets of compounds with real-valued activities. The predictive efficiency of QSAR models is expressed as the correct classification rate of the SVM algorithm, which is compared with the results of two other algorithms: MODI and Voronin’s algorithm as modified by the authors. The results were compared using the squared Pearson correlation coefficient. Our study showed that evaluating the predictive efficiency of a data set based only on “activity cliffs” is highly inadequate. To develop more accurate methods for assessing whether effective models can be built on a data sample, other properties of the sample must be taken into account, not only the presence (and number) of “activity cliffs”.
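For orientation, the following is a minimal sketch of the categorical MODI as it is commonly defined: the mean, over activity classes, of the fraction of compounds whose nearest neighbour in descriptor space belongs to the same class. It is paired with the squared Pearson correlation used for the comparison. The function names `modi` and `pearson_r2`, the Euclidean distance, and the toy descriptors are assumptions made for illustration; this is not the authors' extension to real-valued activities, nor their modification of Voronin's algorithm.

```python
import numpy as np


def modi(X, y):
    """Categorical modelability index (illustrative sketch).

    Returns the mean over classes of the fraction of compounds whose
    nearest neighbour (Euclidean, an assumption) has the same class label.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances between all compounds.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # A compound must not be counted as its own nearest neighbour.
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)
    same_class = y[nearest] == y
    # Average the per-class fractions so that small classes weigh equally.
    return float(np.mean([same_class[y == c].mean() for c in np.unique(y)]))


def pearson_r2(a, b):
    """Squared Pearson correlation coefficient between two score vectors."""
    r = np.corrcoef(a, b)[0, 1]
    return float(r ** 2)


# Toy one-dimensional descriptors (hypothetical data).
X_toy = [[0.0], [0.1], [1.0], [1.1]]
smooth = modi(X_toy, [0, 0, 1, 1])   # neighbours share a class
cliffy = modi(X_toy, [0, 1, 0, 1])   # every nearest pair is an activity cliff
```

In this sketch, a data set in which every nearest-neighbour pair crosses a class boundary scores at the bottom of the index, which is the sense in which such indexes count activity cliffs. Under these assumptions, `pearson_r2` would then be applied to per-data-set index values and the corresponding SVM correct classification rates.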