aBIOTECH | Pub Date: 2024-07-31 | DOI: 10.1007/s42994-024-00178-0
Yunlong Liu, Morteza H. Ghaffari, Tao Ma, Yan Tu
{"title":"Impact of database choice and confidence score on the performance of taxonomic classification using Kraken2","authors":"Yunlong Liu, Morteza H. Ghaffari, Tao Ma, Yan Tu","doi":"10.1007/s42994-024-00178-0","DOIUrl":"10.1007/s42994-024-00178-0","url":null,"abstract":"<div><p>Accurate taxonomic classification is essential to understanding microbial diversity and function through metagenomic sequencing. However, this task is complicated by the vast variety of microbial genomes and the computational limitations of bioinformatics tools. The aim of this study was to evaluate the impact of reference database selection and confidence score (CS) settings on the performance of Kraken2, a widely used k-mer-based metagenomic classifier. In this study, we generated simulated metagenomic datasets to systematically evaluate how the choice of reference databases, from the compact Minikraken v1 to the expansive nt- and GTDB r202, and different CS (from 0 to 1.0) affect the key performance metrics of Kraken2. These metrics include classification rate, precision, recall, F1 score, and accuracy of true versus calculated bacterial abundance estimation. Our results show that higher CS, which increases the rigor of taxonomic classification by requiring greater k-mer agreement, generally decreases the classification rate. This effect is particularly pronounced for smaller databases such as Minikraken and Standard-16, where no reads could be classified when the CS was above 0.4. In contrast, for larger databases such as Standard, nt and GTDB r202, precision and F1 scores improved significantly with increasing CS, highlighting their robustness to stringent conditions. Recovery rates were mostly stable, indicating consistent detection of species under different CS settings. Crucially, the results show that a comprehensive reference database combined with a moderate CS (0.2 or 0.4) significantly improves classification accuracy and sensitivity. 
This finding underscores the need for careful selection of database and CS parameters tailored to specific scientific questions and available computational resources to optimize the results of metagenomic analyses.</p></div>","PeriodicalId":53135,"journal":{"name":"aBIOTECH","volume":"5 4","pages":"465 - 475"},"PeriodicalIF":4.6,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s42994-024-00178-0.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142789389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A simple, highly efficient Agrobacterium tumefaciens‐mediated moss transformation system with broad applications","authors":"Ping Zhou, Xiujin Liu, Yuqing Liang, Yan Zhang, Xiaoshuang Li, Daoyuan Zhang","doi":"10.1007/s42994-024-00174-4","DOIUrl":"10.1007/s42994-024-00174-4","url":null,"abstract":"<div><p>Mosses, particularly desiccation-tolerant (DT) species, are important model organisms for studying genes involved in plant development and stress resistance. The lack of a simple and efficient stable moss transformation system has hindered progress in deciphering the genetic mechanisms underlying traits of interest in these organisms. Here, we present an <i>Agrobacterium tumefaciens</i>-mediated transformation system for DT mosses that uses <i>Agrobacterium</i> strain EHA105 harboring the binary vector pCAMBIA1301-GUS. This system achieved transformation efficiencies of 74% and 81% in <i>Physcomitrium patens</i> and <i>Bryum argenteum</i> protonemata, respectively, without the need for culture and callus formation prior to regeneration. We detected GUS enzyme activity in the regenerated transgenic moss via histochemical staining. Southern blot, PCR, and RT-qPCR analyses confirmed the presence of the <i>GUS</i> gene. In addition, we successfully used this system to transform wild DT <i>Syntrichia caninervis</i>. Furthermore, <i>P. patens</i> and <i>B. argenteum</i> transformed using this system with the stress resistance gene <i>EsDREB</i> from the desert plant <i>Eremosparton songoricum</i> (Litv.) exhibited improved salt tolerance. 
We thus present an efficient tool for the genetic analysis of DT moss species, paving the way for the development of stress-resistant crop cultivars.</p></div>","PeriodicalId":53135,"journal":{"name":"aBIOTECH","volume":"5 4","pages":"476 - 487"},"PeriodicalIF":4.6,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s42994-024-00174-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141822289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inference and prioritization of tissue-specific regulons in Arabidopsis and Oryza","authors":"Honggang Dai, Yaxin Fan, Yichao Mei, Ling-Ling Chen, Junxiang Gao","doi":"10.1007/s42994-024-00176-2","DOIUrl":"10.1007/s42994-024-00176-2","url":null,"abstract":"<div><p>A regulon refers to a group of genes regulated by a transcription factor binding to regulatory motifs to achieve specific biological functions. To infer tissue-specific gene regulons in <i>Arabidopsis</i>, we developed a novel pipeline named InferReg. InferReg utilizes a gene expression matrix that includes 3400 <i>Arabidopsis</i> transcriptomes to make initial predictions about the regulatory relationships between transcription factors (TFs) and target genes (TGs) using co-expression patterns. It further improves these anticipated interactions by integrating TF binding site enrichment analysis to eliminate false positives that are only supported by expression data. InferReg further trained a graph convolutional network with 133 transcription factors, supported by ChIP-seq, as positive samples, to learn the regulatory logic between TFs and TGs to improve the accuracy of the regulatory network. To evaluate the functionality of InferReg, we utilized it to discover tissue-specific regulons in 5 <i>Arabidopsis</i> tissues: flower, leaf, root, seed, and seedling. We ranked the activities of regulons for each tissue based on reliability using Borda ranking and compared them with existing databases. The results demonstrated that InferReg not only identified known tissue-specific regulons but also discovered new ones. By applying InferReg to rice expression data, we were able to identify rice tissue-specific regulons, showing that our approach can be applied more broadly. 
We used InferReg to successfully identify important regulons in various tissues of <i>Arabidopsis</i> and <i>Oryza</i>, which has improved our understanding of tissue-specific regulations and the roles of regulons in tissue differentiation and development.</p></div>","PeriodicalId":53135,"journal":{"name":"aBIOTECH","volume":"5 3","pages":"309 - 324"},"PeriodicalIF":4.6,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141643629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}