Integrative machine learning and RT-qPCR analysis identify key stress-responsive genes in Thermus thermophilus HB8
Abbas Karimi-Fard, Abbas Saidi, Masoud Tohidfar, Seyedeh Noushin Emami
Genetica, vol. 153, no. 1, article 28. Published 2025-08-20. DOI: 10.1007/s10709-025-00243-6 (https://doi.org/10.1007/s10709-025-00243-6)
Abstract
Bacteria are constantly exposed to diverse environmental stresses, necessitating complex adaptive mechanisms for survival. Thermus thermophilus, a thermophilic extremophile, serves as an excellent model for investigating these responses due to its remarkable resilience to harsh conditions. Recent advances in artificial intelligence, particularly in machine learning, have transformed the identification of novel stress-responsive biomarkers. In this study, we analyzed transcriptomic data from 65 T. thermophilus HB8 samples subjected to various abiotic stresses to identify key genes involved in stress adaptation. We applied a suite of supervised machine learning algorithms to classify samples and prioritize informative features. Among the tested models, Extreme Gradient Boosting (XGBoost) and Random Forest (RF) achieved the highest classification performance, with XGBoost attaining perfect discrimination between stressed and control samples (AUC = 1.00) and RF closely following (AUC = 0.99). Feature importance analysis consistently identified three candidate genes: TTHA0029, TTHA1720, and TTHA1359. Functional validation using RT-qPCR confirmed the significant upregulation of TTHA0029 and TTHA1720 under salt and hydrogen peroxide stress, suggesting roles in redox regulation and ionic homeostasis. Phylogenetic analysis further revealed the specificity of these genes to the Thermus genus. Overall, our findings highlight central molecular players in stress tolerance in T. thermophilus and demonstrate the utility of machine learning in biomarker discovery. The identified genes, TTHA0029 and TTHA1720, may serve as promising targets for genetic engineering to improve stress resilience in both crops and industrially relevant microorganisms.
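To make the reported AUC values concrete, the following is a minimal sketch of how the area under the ROC curve can be computed as a pairwise ranking statistic: the fraction of (stressed, control) sample pairs in which the stressed sample receives the higher classifier score. The function name `roc_auc` and all scores below are hypothetical illustrations; they are not the study's code or data, and the abstract does not say how the authors computed AUC.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores (1 = stressed, 0 = control); not study data.
labels = [1, 1, 1, 0, 0, 0]
perfect_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]      # classes fully separated
overlapping_scores = [0.9, 0.8, 0.25, 0.3, 0.2, 0.1]  # one stressed sample ranked low

auc_perfect = roc_auc(labels, perfect_scores)
auc_overlap = roc_auc(labels, overlapping_scores)
```

Under this definition, an AUC of 1.00 means every stressed sample scored above every control sample, while a single misranked pair out of nine lowers the value to 8/9. With a small sample set, a perfect AUC is therefore easier to reach than with a large one, which is worth keeping in mind when reading such scores.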
About the journal:
Genetica publishes papers dealing with genetics, genomics, and evolution. The journal covers novel advances in genomics, conservation genetics, genotype-phenotype interactions, evo-devo, population and quantitative genetics, and biodiversity. Genetica publishes original research articles addressing novel conceptual, experimental, and theoretical issues in these areas, whatever the taxon considered. Biomedical papers and papers on animal and plant breeding genetics are outside the scope of Genetica unless framed in an evolutionary context. Recent advances in genetics, genomics, and evolution also appear in thematic issues and in synthesis papers written by experts in the field.