Application of Rough Set Theory to High Throughput Screening Data for Rational Selection of Lead Compounds

IF 0.8, JCR Q4, BIOCHEMISTRY & MOLECULAR BIOLOGY

Chem-Bio Informatics Journal. Pub Date: 2008-01-01. DOI: 10.1273/CBIJ.8.85

Michio Koyama, K. Hasegawa, Masamoto Arakawa, K. Funatsu

Pages: 85-95. URL: https://doi.org/10.1273/CBIJ.8.85

Citations: 5

Abstract

In drug discovery, high-throughput screening (HTS) is widely used to identify new lead compounds. However, a considerable number of hit compounds turn out to have low activity when their inhibitory activities are measured more precisely. Such compounds are called false positives. To select lead compounds more efficiently, virtual screening methods based on QSAR models have been investigated, but no definitive solution has been found. In this study, we propose an effective method for identifying lead compounds. The proposed method is based on rough set theory (RST), a mathematical tool for describing the uncertainty and vagueness of knowledge. The essential parts of RST are the construction of reducts, which are minimal subsets of variables that distinguish samples, and the extraction of rules from those reducts. By applying RST to a QSAR study of monoamine oxidase (MAO) inhibitors, we extracted several rules for identifying lead compounds. First, 3D structures of the MAO inhibitors were generated uniformly with CORINA, and chemical descriptors were calculated by the VolSurf method. Finally, three unique rules were extracted using RST. Each rule was found to be chemically reasonable and consistent with previous studies. Furthermore, the predictive power of RST was demonstrated by comparison with partial least squares (PLS) and decision tree (DT) models. These results demonstrate the usefulness of our method.
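The two RST steps named in the abstract, reduct construction and rule extraction, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy decision table, the descriptor names (`hydrophobic`, `polar`, `volume`), the function names, and the brute-force search are hypothetical and are not the authors' data, descriptors, or implementation. The sketch also assumes a consistent decision table, in which case a reduct is a minimal attribute subset under which samples with identical attribute values never differ in activity class.

```python
from itertools import combinations

# Hypothetical toy decision table (NOT data from the paper):
# discretized descriptor values for each compound plus an activity class.
TABLE = [
    ({"hydrophobic": "high", "polar": "low", "volume": "small"}, "active"),
    ({"hydrophobic": "high", "polar": "high", "volume": "small"}, "inactive"),
    ({"hydrophobic": "low", "polar": "low", "volume": "large"}, "inactive"),
    ({"hydrophobic": "high", "polar": "low", "volume": "large"}, "active"),
]


def is_consistent(table, attrs):
    """Return True if samples that agree on attrs always share one decision."""
    seen = {}
    for row, decision in table:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def find_reducts(table):
    """Brute-force search for all minimal attribute subsets that keep the
    table consistent. Exponential in the number of attributes, so this is
    only suitable for small illustrative tables."""
    all_attrs = sorted(table[0][0])
    reducts = []
    for size in range(1, len(all_attrs) + 1):
        for subset in combinations(all_attrs, size):
            # Skip supersets of a reduct already found: they are not minimal.
            if any(set(r) <= set(subset) for r in reducts):
                continue
            if is_consistent(table, subset):
                reducts.append(subset)
    return reducts


def extract_rules(table, reduct):
    """Map each observed combination of reduct values to its decision."""
    rules = {}
    for row, decision in table:
        condition = tuple((a, row[a]) for a in reduct)
        rules[condition] = decision
    return rules


if __name__ == "__main__":
    for reduct in find_reducts(TABLE):
        print("reduct:", reduct)
        for condition, decision in extract_rules(TABLE, reduct).items():
            clause = " AND ".join(f"{a} = {v}" for a, v in condition)
            print(f"  IF {clause} THEN {decision}")
```

In this toy table no single descriptor separates active from inactive compounds, but the pair (`hydrophobic`, `polar`) does, so `volume` is dropped and each distinct value combination of the remaining pair becomes one IF-THEN rule. Practical RST tools typically use discernibility matrices or heuristic searches rather than enumerating every subset.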




Source journal

Chem-Bio Informatics Journal BIOCHEMISTRY & MOLECULAR BIOLOGY-

CiteScore

0.60

Self-citation rate

0.00%

Articles published