A molecular representation system with a common reference frame for analyzing triterpenoid structural diversity
Nicole Babineau, Le Thanh Dien Nguyen, Davis Mathieu, Clint McCue, Nicholas Schlecht, Taylor Abrahamson, Björn Hamberger, Lucas Busta
Plant Communications, article 101320. Published 2025-05-12 (Epub 2025-03-24). DOI: https://doi.org/10.1016/j.xplc.2025.101320
Abstract
Researchers have uncovered hundreds of thousands of natural products, many of which contribute to medicine, materials, and agriculture. However, missing knowledge about the biosynthetic pathways of these products hinders their expanded use. Nucleotide sequencing is key to pathway elucidation efforts, and analyses of the molecular structures of natural products, although seldom discussed explicitly, also play an important role by suggesting hypothetical pathways for testing. Structural analyses are also important in drug discovery, for which many molecular representation systems (methods of representing molecular structures in a computer-friendly format) have been developed. Unfortunately, pathway elucidation investigations seldom use these representation systems. This gap likely occurs because those systems are primarily built to document molecular connectivity and topology rather than the absolute positions of bonds and atoms in a common reference frame, which would enable chemical structures to be connected with potential underlying biosynthetic steps. Here, we expand on recently developed skeleton-based molecular representation systems by implementing a common-reference-frame-oriented system. We tested this system using triterpenoid structures as a case study and explored its applications in biosynthesis and structural diversity tasks. The common-reference-frame system can identify structural regions of high or low variability on the scale of atoms and bonds and enable hierarchical clustering that is closely connected to underlying biosynthesis. Combined with information on phylogenetic distribution, the system illuminates distinct sources of structural variability, such as different enzyme families operating in the same pathway. These characteristics outline the potential of common-reference-frame molecular representation systems to support large-scale pathway elucidation efforts.
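The idea of a common reference frame can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that each molecule is encoded as a list of states at the same fixed skeleton positions, and the molecule names, positions, and states below are invented toy data. Under those assumptions, per-position variability can be scored with Shannon entropy, and molecules can be grouped by a naive single-linkage clustering on the number of positions at which they differ.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Hypothetical toy data (not from the paper): every molecule is described
# at the same fixed positions, which plays the role of the common frame.
POSITIONS = ["C3", "C11", "C12-C13", "C17", "C28"]
MOLECULES = {
    "mol_A": ["OH", "H", "double", "H", "CH3"],
    "mol_B": ["OH", "H", "double", "H", "COOH"],
    "mol_C": ["OH", "=O", "single", "OH", "CH3"],
    "mol_D": ["OH", "=O", "single", "OH", "COOH"],
}


def position_entropy(molecules):
    """Shannon entropy (bits) of the states observed at each frame position."""
    entropies = []
    for column in zip(*molecules.values()):
        total = len(column)
        counts = Counter(column)
        entropies.append(sum(c / total * log2(total / c) for c in counts.values()))
    return entropies


def hamming(a, b):
    """Number of frame positions at which two encodings differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_distance(molecules, a, b):
    """Single-linkage distance: the closest pair of members across two clusters."""
    return min(hamming(molecules[m], molecules[n]) for m in a for n in b)


def single_linkage(molecules):
    """Naive agglomerative clustering; returns merges as (left, right, distance)."""
    clusters = [frozenset([name]) for name in molecules]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(molecules, *pair),
        )
        merges.append((sorted(a), sorted(b), cluster_distance(molecules, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


if __name__ == "__main__":
    for position, entropy in zip(POSITIONS, position_entropy(MOLECULES)):
        print(f"{position}: {entropy:.2f} bits")
    for left, right, distance in single_linkage(MOLECULES):
        print(f"merge {left} + {right} at distance {distance}")
```

Because every molecule is described at the same positions, a position-wise comparison is meaningful without any graph alignment step, which is the property that lets variability be localized to specific atoms and bonds. For real data, an established routine such as SciPy's hierarchical clustering would be a more appropriate choice than the naive loop shown here.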
About the journal:
Plant Communications is an open access publishing platform that supports the global plant science community. It publishes original research, review articles, technical advances, and research resources in various areas of plant sciences. The scope of topics includes evolution, ecology, physiology, biochemistry, development, reproduction, metabolism, molecular and cellular biology, genetics, genomics, environmental interactions, biotechnology, breeding of higher and lower plants, and their interactions with other organisms. The goal of Plant Communications is to provide a high-quality platform for the dissemination of plant science research.