Michio Koyama, K. Hasegawa, Masamoto Arakawa, K. Funatsu
Application of Rough Set Theory to High Throughput Screening Data for Rational Selection of Lead Compounds
Chem-Bio Informatics Journal, 2008, pp. 85-95. DOI: 10.1273/CBIJ.8.85 (https://doi.org/10.1273/CBIJ.8.85)
Citation count: 5
Abstract
In drug discovery, high-throughput screening (HTS) is widely used to identify new lead compounds. However, a considerable number of hit compounds turn out to have low activity when their inhibitory activity is measured more precisely; such compounds are called false positives. To select lead compounds more efficiently, virtual screening methods based on quantitative structure-activity relationship (QSAR) models have been investigated, but no definitive solution has been found. In this study, we propose an effective method for identifying lead compounds. The proposed method is based on rough set theory (RST), a mathematical tool for describing the uncertainty and vagueness of knowledge. The two essential steps of RST are the construction of reducts, which are minimal subsets of variables sufficient to distinguish samples, and the extraction of rules from these reducts. By applying RST to a QSAR study of monoamine oxidase (MAO) inhibitors, we extracted several rules for identifying lead compounds. First, 3D structures of the MAO inhibitors were generated uniformly with CORINA, and chemical descriptors were calculated by the VolSurf method. Finally, three unique rules were extracted using RST. Each rule was found to be chemically reasonable and consistent with previous studies. Furthermore, the predictive power of RST was confirmed by comparison with partial least squares (PLS) and decision tree (DT) models. These results demonstrate the usefulness of our method.
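The two RST steps named in the abstract, reduct construction and rule extraction, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy decision table, the descriptor names, and the brute-force search are hypothetical and are not the data, the VolSurf descriptors, or the algorithm used in the paper. A reduct is taken here to mean a minimal subset of attributes under which samples with identical attribute values never carry different activity labels.

```python
from itertools import combinations

# Hypothetical toy decision table: discretized descriptors -> activity class.
# Names and values are illustrative only, not data from the paper.
SAMPLES = [
    ({"hydrophobic": "high", "polar": "low",  "volume": "small"}, "active"),
    ({"hydrophobic": "high", "polar": "high", "volume": "small"}, "active"),
    ({"hydrophobic": "low",  "polar": "low",  "volume": "large"}, "inactive"),
    ({"hydrophobic": "low",  "polar": "high", "volume": "small"}, "inactive"),
]
ATTRIBUTES = ["hydrophobic", "polar", "volume"]


def is_consistent(attrs, samples):
    """Return True if samples that agree on attrs never differ in decision."""
    seen = {}
    for row, decision in samples:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def find_reducts(attributes, samples):
    """Exhaustively list minimal attribute subsets that keep the table consistent."""
    reducts = []
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= set(subset) for r in reducts):
                continue
            if is_consistent(subset, samples):
                reducts.append(subset)
    return reducts


def extract_rules(reduct, samples):
    """Map each observed value combination over the reduct to its decision."""
    rules = {}
    for row, decision in samples:
        condition = tuple((a, row[a]) for a in reduct)
        rules[condition] = decision
    return rules


if __name__ == "__main__":
    for reduct in find_reducts(ATTRIBUTES, SAMPLES):
        for condition, decision in extract_rules(reduct, SAMPLES).items():
            clause = " AND ".join(f"{a} = {v}" for a, v in condition)
            print(f"IF {clause} THEN {decision}")
```

The exhaustive search is exponential in the number of attributes, so it only suits small tables like this one; practical RST tools rely on heuristics or discernibility-matrix methods to find reducts.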