M. Ozdemir, M. Embrechts, F. Arciniegas, C. Breneman, L. Lockwood, Kristin P. Bennett
SMCia/01. Proceedings of the 2001 IEEE Mountain Workshop on Soft Computing in Industrial Applications (Cat. No.01EX504), 2001-06-25. DOI: 10.1109/SMCIA.2001.936728 (https://doi.org/10.1109/SMCIA.2001.936728)
Feature selection for in-silico drug design using genetic algorithms and neural networks
QSAR (quantitative structure-activity relationship) is a discipline within computational chemistry that deals with predictive modeling, often for relatively small datasets in which the number of features may exceed the number of data points, leading to severe dimensionality problems. This paper presents a novel feature selection procedure for QSAR, based on genetic algorithms, that mitigates the curse of dimensionality. The genetic algorithm minimizes a cost function derived from the correlation matrix between the features and the activity being modeled. From a QSAR dataset with 160 features, the genetic algorithm selected a subset of 40 features that yielded a better predictive model than the full feature set. The feature reduction results obtained with the genetic algorithm were also compared with those of neural network sensitivity analysis.
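The abstract does not give the exact cost function or the genetic operators, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical cost, `correlation_cost`, that rewards a high mean absolute correlation between the selected features and the activity and penalizes the mean absolute correlation among the selected features. The bitmask encoding, binary tournament selection, uniform crossover, bit-flip mutation, elitism, and all parameter values are likewise illustrative assumptions.

```python
import numpy as np


def correlation_cost(mask, corr_xy, corr_xx, redundancy_weight=1.0):
    """Hypothetical cost (an assumption, not taken from the paper).

    Low when the selected features correlate with the activity (relevance)
    and only weakly with each other (redundancy).
    """
    idx = np.flatnonzero(mask)
    k = idx.size
    if k == 0:
        return np.inf  # an empty subset is never acceptable
    relevance = np.abs(corr_xy[idx]).mean()
    redundancy = 0.0
    if k > 1:
        sub = np.abs(corr_xx[np.ix_(idx, idx)])
        # mean absolute off-diagonal correlation among selected features
        redundancy = (sub.sum() - np.trace(sub)) / (k * (k - 1))
    return redundancy_weight * redundancy - relevance


def ga_select(X, y, pop_size=40, generations=60, mutation_rate=0.02, seed=0):
    """Search for a boolean feature mask that minimizes correlation_cost."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]

    # Correlation matrix of [features | activity]; the last column is the activity.
    corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    corr_xx, corr_xy = corr[:-1, :-1], corr[:-1, -1]

    pop = rng.random((pop_size, n_features)) < 0.5  # random initial bitmasks
    best_mask, best_cost = pop[0].copy(), np.inf
    for _ in range(generations):
        costs = np.array([correlation_cost(m, corr_xy, corr_xx) for m in pop])
        i = int(costs.argmin())
        if costs[i] < best_cost:
            best_mask, best_cost = pop[i].copy(), float(costs[i])

        # Binary tournament selection: the cheaper of two random individuals wins.
        pairs = rng.integers(pop_size, size=(pop_size, 2))
        first_wins = costs[pairs[:, 0]] <= costs[pairs[:, 1]]
        parents = pop[np.where(first_wins, pairs[:, 0], pairs[:, 1])]

        # Uniform crossover between each parent and a randomly assigned partner.
        partners = parents[rng.permutation(pop_size)]
        take = rng.random((pop_size, n_features)) < 0.5
        children = np.where(take, parents, partners)

        # Bit-flip mutation, then elitism: keep the best mask found so far.
        children ^= rng.random((pop_size, n_features)) < mutation_rate
        children[0] = best_mask
        pop = children
    return best_mask, best_cost


if __name__ == "__main__":
    # Synthetic stand-in for a QSAR dataset: only features 0 and 1 drive the activity.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(60, 20))
    y = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=60)
    mask, cost = ga_select(X, y)
    print("selected feature indices:", np.flatnonzero(mask), "cost:", cost)
```

This particular cost tends to favor very small subsets, so reproducing a fixed subset size such as the 40 features reported above would require an additional constraint or penalty term, which the abstract does not describe. The selected subset would then be passed to a predictive model and compared against the model trained on all features.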