Application of machine learning to Isoniazid resistance analysis
Author: Zhou Yang
Published in: Proceedings of the 2022 11th International Conference on Bioinformatics and Biomedical Science, 2022-10-28
DOI: https://doi.org/10.1145/3571532.3571541
Correct and timely detection of Mycobacterium tuberculosis (MTB) resistance to existing tuberculosis (TB) drugs is essential for limiting TB amplification. The objectives of this project are (1) to develop classification models that aid the diagnosis of isoniazid-resistant TB, (2) to find the best-performing classification algorithm, and (3) to rank the gene mutations according to feature importance. The Python sklearn and matplotlib packages were used throughout the research for data curation, classification model development, and feature importance ranking. Area under the curve (AUC), precision, sensitivity, specificity, F1 score, and correct classification rate were used to measure model performance, and Gini importance was used to calculate feature importance. Gradient boosting was found to outperform the other classification models, with the highest mean accuracy of 0.852, and its overfitting error exposed the need for dimensionality reduction prior to model training. Genes 625 and 331 were the most significant features in this project, which suggests the potential of machine learning (ML) to find new resistance markers. The results confirmed the applicability of ML in clinical settings for quicker and better prediction of drug resistance based on large genome sequencing data. With future studies focusing on less-studied and second-line TB drugs, classification models could decrease mortality and prevent the amplification of existing antibiotic resistance by allowing early diagnosis and treatment.

CCS CONCEPTS • Computing methodologies∼Machine learning∼Learning paradigms∼Supervised learning∼Supervised learning by classification
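The evaluation workflow named in the abstract (a gradient boosting classifier, confusion-matrix metrics, AUC, and an importance ranking) can be illustrated with a minimal sketch. It is not the author's pipeline: the data are synthetic, the helper names (`make_synthetic_mutations`, `evaluate`, `run_sketch`) are hypothetical, and the two "causal" columns (3 and 7) are arbitrary choices made only so the ranking has something to find. The ranking uses scikit-learn's impurity-based `feature_importances_`, which its documentation also calls Gini importance.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split


def make_synthetic_mutations(n_isolates=400, n_genes=50, seed=0):
    """Hypothetical stand-in data: binary mutation matrix plus resistance labels."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_isolates, n_genes))
    # Assumption: resistance occurs when column 3 or column 7 is mutated,
    # with about 5% of the labels flipped as noise.
    flipped = rng.random(n_isolates) < 0.05
    y = ((X[:, 3] | X[:, 7]).astype(bool) ^ flipped).astype(int)
    return X, y


def evaluate(y_true, y_pred, y_score):
    """Compute the metrics listed in the abstract from a binary confusion matrix."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall on resistant isolates
    specificity = tn / (tn + fp)   # recall on susceptible isolates
    return {
        "auc": roc_auc_score(y_true, y_score),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # correct classification rate
    }


def run_sketch(seed=0):
    """Train on a stratified split, score the held-out set, rank the features."""
    X, y = make_synthetic_mutations(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = GradientBoostingClassifier(random_state=seed)
    model.fit(X_train, y_train)
    metrics = evaluate(
        y_test, model.predict(X_test), model.predict_proba(X_test)[:, 1]
    )
    # Column indices sorted from most to least important.
    ranking = np.argsort(model.feature_importances_)[::-1]
    return metrics, ranking


if __name__ == "__main__":
    metrics, ranking = run_sketch()
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
    print("top features:", ranking[:5].tolist())
```

Comparing the held-out metrics with the same metrics computed on the training split is one simple way to see the kind of overfitting gap the abstract mentions. With real data, the columns of `X` would be the curated gene mutation features.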