{"title":"PTF-Vāc: An explainable and generative deep co-learning encoders-decoders system for ab-initio discovery of plant transcription factor binding sites.","authors":"Sagar Gupta, Jyoti, Umesh Bhati, Veerbhan Kesarwani, Akanksha Sharma, Ravi Shankar","doi":"10.1016/j.xplc.2025.101543","DOIUrl":null,"url":null,"abstract":"<p><p>Discovery of transcription factors (TFs) binding sites (TFBS) and their motifs in plants pose significant challenges due to high cross-species variability. The interaction between TFs and their binding sites is highly specific and context dependent. Most of the existing TFBS finding tools are not accurate enough to discover these binding sites in plants. They fail to capture the cross-species variability, interdependence between TF structure and its TFBS, and context specificity of binding. Since they are coupled to predefined TF specific model/matrix, they are highly vulnerable towards the volume and quality of data provided to build the motifs. All these software make a presumption that the user input would be specific to any particular TF which renders them of very limited use for practical applications like genomic annotations of newly sequenced species. Here, we report an explainable Deep Encoders-Decoders generative system, PTF-Vāc, founded on a universal model of deep co-learning on variability in binding sites and TF structure, PTFSpot, making it completely free from the bottlenecks mentioned above. It has successfully decoupled the process of TFBS discovery from the prior step of motif finding and requirement of TF specific motif models. Due to the universal model for TF:DNA interactions as its guide, it can discover the binding motifs in total independence from data volume, species and TF specific models. In a comprehensive benchmarking study across a huge volume of experimental data, it has outperformed most advanced motif finding deep learning (DL) algorithms. With this all, PTF-Vāc brings a completely new chapter in ab-initio TFBS discovery through generative AI.</p>","PeriodicalId":52373,"journal":{"name":"Plant Communications","volume":" ","pages":"101543"},"PeriodicalIF":11.6000,"publicationDate":"2025-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Plant Communications","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1016/j.xplc.2025.101543","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Discovery of transcription factors (TFs) binding sites (TFBS) and their motifs in plants pose significant challenges due to high cross-species variability. The interaction between TFs and their binding sites is highly specific and context dependent. Most of the existing TFBS finding tools are not accurate enough to discover these binding sites in plants. They fail to capture the cross-species variability, interdependence between TF structure and its TFBS, and context specificity of binding. Since they are coupled to predefined TF specific model/matrix, they are highly vulnerable towards the volume and quality of data provided to build the motifs. All these software make a presumption that the user input would be specific to any particular TF which renders them of very limited use for practical applications like genomic annotations of newly sequenced species. Here, we report an explainable Deep Encoders-Decoders generative system, PTF-Vāc, founded on a universal model of deep co-learning on variability in binding sites and TF structure, PTFSpot, making it completely free from the bottlenecks mentioned above. It has successfully decoupled the process of TFBS discovery from the prior step of motif finding and requirement of TF specific motif models. Due to the universal model for TF:DNA interactions as its guide, it can discover the binding motifs in total independence from data volume, species and TF specific models. In a comprehensive benchmarking study across a huge volume of experimental data, it has outperformed most advanced motif finding deep learning (DL) algorithms. With this all, PTF-Vāc brings a completely new chapter in ab-initio TFBS discovery through generative AI.
期刊介绍:
Plant Communications is an open access publishing platform that supports the global plant science community. It publishes original research, review articles, technical advances, and research resources in various areas of plant sciences. The scope of topics includes evolution, ecology, physiology, biochemistry, development, reproduction, metabolism, molecular and cellular biology, genetics, genomics, environmental interactions, biotechnology, breeding of higher and lower plants, and their interactions with other organisms. The goal of Plant Communications is to provide a high-quality platform for the dissemination of plant science research.