Gemma R. Takahashi, Franchesca M. Cumpio, Carter T. Butts, Rachel W. Martin
Applications in Plant Sciences, 13(4). Published 2025-06-03. DOI: 10.1002/aps3.70009. https://bsapubs.onlinelibrary.wiley.com/doi/10.1002/aps3.70009
The Computer-Assisted Sequence Annotation (CASA) workflow for enzyme discovery
Premise
With the advent of inexpensive nucleic acid sequencing and automated annotation at the level of basic functionality, the central problem of enzyme discovery is no longer finding active sequences but determining which ones are suitable for further study. This requires annotation that goes beyond sequence similarity to known enzymes and provides information at the sequence and structural levels.
Methods
Here we introduce a workflow for generating highly informative, richly annotated sequence alignments from protein sequence data. Computer-Assisted Sequence Annotation (CASA) is a freely available Python-based workflow designed to automate portions of novel protein characterization, while producing a human-interpretable final output.
Results
We demonstrate CASA using one enzyme from the Drosera capensis genome. The workflow generates detailed annotations providing comparisons to known reference sequences. In addition to sequence similarity and predicted function, user-specified features such as active site residues, disulfide bonds, and substrate-binding residues can be displayed, and these can then be combined with downstream analyses to gain new insights into enzyme structure and function.
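To make the idea of displaying user-specified features on an alignment concrete, the following is a minimal Python sketch. It is not CASA's code or API: the function names, the toy sequences, and the feature symbols are hypothetical and chosen only for illustration. It assumes features are given as 1-based residue numbers on a reference sequence and maps them onto the gapped alignment columns.

```python
from typing import Dict, List


def residue_to_column(aligned_ref: str, gap: str = "-") -> Dict[int, int]:
    """Map 1-based residue numbers of the reference to 0-based alignment columns."""
    mapping: Dict[int, int] = {}
    residue = 0
    for col, ch in enumerate(aligned_ref):
        if ch != gap:
            residue += 1
            mapping[residue] = col
    return mapping


def annotation_track(aligned_ref: str,
                     features: Dict[str, List[int]],
                     gap: str = "-") -> str:
    """Build a one-line track marking feature positions under the alignment.

    `features` maps a single-character symbol to the 1-based reference
    residue numbers that carry that feature.
    """
    mapping = residue_to_column(aligned_ref, gap)
    track = [" "] * len(aligned_ref)
    for symbol, positions in features.items():
        for pos in positions:
            track[mapping[pos]] = symbol
    return "".join(track)


# Toy data (hypothetical, not taken from the paper).
alignment = {
    "reference": "MK-CDSH-C",
    "query":     "MKACDTHGC",
}
# Hypothetical symbols: '*' marks active site residues, 'S' marks
# cysteines involved in a disulfide bond.
features = {"*": [4, 6], "S": [3, 7]}

track = annotation_track(alignment["reference"], features)

# Print each sequence with the annotation track underneath.
for name, seq in alignment.items():
    print(f"{name:<10} {seq}")
print(f"{'features':<10} {track}")
```

Keeping the features in reference numbering and converting them to columns only at display time means the same feature list can be reused when the alignment is regenerated with additional sequences.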
Discussion
This work demonstrates the utility of detailed annotations and protein structure prediction for choosing protein targets for biochemistry or structural biology from nucleic acid sequence data. The toolchain is freely available along with instructions and representative examples.
About the journal:
Applications in Plant Sciences (APPS) is a monthly, peer-reviewed, open access journal promoting the rapid dissemination of newly developed, innovative tools and protocols in all areas of the plant sciences, including genetics, structure, function, development, evolution, systematics, and ecology. Given the rapid progress today in technology and its application in the plant sciences, the goal of APPS is to foster communication within the plant science community to advance scientific research. APPS is a publication of the Botanical Society of America, originating in 2009 as the American Journal of Botany's online-only section, AJB Primer Notes & Protocols in the Plant Sciences.
APPS publishes the following types of articles: (1) Protocol Notes describe new methods and technological advancements; (2) Genomic Resources Articles characterize the development and demonstrate the usefulness of newly developed genomic resources, including transcriptomes; (3) Software Notes detail new software applications; (4) Application Articles illustrate the application of a new protocol, method, or software application within the context of a larger study; (5) Review Articles evaluate available techniques, methods, or protocols; (6) Primer Notes report novel genetic markers with evidence of wide applicability.