Classification and prediction model of compound pharmacokinetic properties based on ensemble learning method
Authors: Jiayi Zhao, Yang Liu
Published in: Proceedings of the 2nd International Symposium on Artificial Intelligence for Medicine Sciences, 2021-10-29
DOI: https://doi.org/10.1145/3500931.3501021
This paper models the absorption, distribution, metabolism, excretion, and toxicity (ADMET) of compounds and builds a separate classification model for each of five ADMET properties: Caco-2, CYP3A4, hERG, HOB, and MN. First, the main variables for each of the five properties are identified and a dedicated data set is constructed. Then, two ensemble learning schemes are used for modeling: bagging over decision trees and boosting in the form of GBDT. Logistic regression and naive Bayes classifiers are trained as a control group for comparison. Finally, accuracy (ACC), F1, and other metrics are used to evaluate the models and to select the best model for each property. The results show that the feature distributions of MN and hERG, Caco-2, CYP3A4, and HOB are similar.
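The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, substitutes synthetic data from `make_classification` for the paper's data set, uses default-like hyperparameters, and picks the "best" model by F1 alone, whereas the abstract mentions ACC, F1, and other metrics without giving a selection rule. The names `ENDPOINTS`, `make_models`, and `evaluate_endpoint` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# The five ADMET properties named in the abstract (one binary task each).
ENDPOINTS = ["Caco-2", "CYP3A4", "hERG", "HOB", "MN"]


def make_models(seed=0):
    """Two ensemble schemes plus the two control classifiers."""
    return {
        # Bagging over decision trees.
        "bagging_tree": BaggingClassifier(
            DecisionTreeClassifier(random_state=seed),
            n_estimators=50,
            random_state=seed,
        ),
        # Boosting: gradient-boosted decision trees (GBDT).
        "gbdt": GradientBoostingClassifier(random_state=seed),
        # Control group.
        "logistic_regression": LogisticRegression(max_iter=1000),
        "naive_bayes": GaussianNB(),
    }


def evaluate_endpoint(X, y, seed=0):
    """Fit every model on one property; return (best model name, all scores)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    scores = {}
    for name, model in make_models(seed).items():
        pred = model.fit(X_tr, y_tr).predict(X_te)
        scores[name] = {
            "acc": accuracy_score(y_te, pred),
            "f1": f1_score(y_te, pred),
        }
    # Assumption: rank by F1 only; the paper's exact rule is not stated.
    best = max(scores, key=lambda n: scores[n]["f1"])
    return best, scores


results = {}
for i, endpoint in enumerate(ENDPOINTS):
    # Synthetic stand-in data: 400 "compounds", 20 "descriptors".
    X, y = make_classification(
        n_samples=400, n_features=20, n_informative=8, random_state=i
    )
    results[endpoint] = evaluate_endpoint(X, y, seed=i)

for endpoint, (best, scores) in results.items():
    print(f"{endpoint}: best={best}, "
          f"acc={scores[best]['acc']:.3f}, f1={scores[best]['f1']:.3f}")
```

With real data, `X` would hold the selected molecular descriptors and `y` the binary label for one property; the scores printed here come from synthetic data and say nothing about the paper's results.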