The Computer-Assisted Sequence Annotation (CASA) workflow for enzyme discovery

IF 2.4 | Zone 3 (Biology) | JCR Q2 (Plant Sciences)

Applications in Plant Sciences Pub Date : 2025-06-03 DOI:10.1002/aps3.70009

Gemma R. Takahashi, Franchesca M. Cumpio, Carter T. Butts, Rachel W. Martin


Citations: 0

Abstract



Premise

With the advent of inexpensive nucleic acid sequencing and automated annotation at the level of basic functionality, the central problem of enzyme discovery is no longer finding active sequences, it is determining which ones are suitable for further study. This requires annotation that goes beyond sequence similarity to known enzymes and provides information at the sequence and structural levels.

Methods

Here we introduce a workflow for generating highly informative, richly annotated sequence alignments from protein sequence data. Computer-Assisted Sequence Annotation (CASA) is a freely available Python-based workflow designed to automate portions of novel protein characterization, while producing a human-interpretable final output.

Results

We demonstrate CASA using one enzyme from the Drosera capensis genome. The workflow generates detailed annotations providing comparisons to known reference sequences. In addition to sequence similarity and predicted function, user-specified features such as active site residues, disulfide bonds, and substrate-binding residues can be displayed, and these can then be combined with downstream analyses to gain new insights into enzyme structure and function.

Discussion

This work demonstrates the utility of detailed annotations and protein structure prediction for choosing protein targets for biochemistry or structural biology from nucleic acid sequence data. The toolchain is freely available along with instructions and representative examples.
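The idea of displaying user-specified features on an alignment, as described in the Results, can be illustrated with a minimal Python sketch. This is not CASA's code or API: the function names, the toy sequences, the feature positions, and the marker symbols below are all hypothetical, chosen only to show how residue numbers in an ungapped reference sequence might be mapped onto alignment columns and rendered as an annotation track.

```python
# Illustrative sketch only: NOT the CASA implementation or its API.
# All names, sequences, and feature positions here are made up.
from typing import Dict, List


def residue_to_column(aligned: str, gap: str = "-") -> Dict[int, int]:
    """Map 1-based residue numbers of the ungapped sequence
    to 0-based column indices in the aligned (gapped) sequence."""
    mapping: Dict[int, int] = {}
    residue = 0
    for col, char in enumerate(aligned):
        if char != gap:
            residue += 1
            mapping[residue] = col
    return mapping


def annotation_track(
    aligned: str,
    features: Dict[str, List[int]],
    symbols: Dict[str, str],
) -> str:
    """Build a one-line track with a marker under each feature residue.
    Feature positions are 1-based residue numbers in the ungapped sequence."""
    track = [" "] * len(aligned)
    mapping = residue_to_column(aligned)
    for name, positions in features.items():
        for pos in positions:
            track[mapping[pos]] = symbols[name]
    return "".join(track)


# Toy aligned sequences (hypothetical, not from Drosera capensis).
reference = "MK-CDHA"
query = "MKTC-HA"

# Hypothetical user-specified features on the reference sequence.
features = {"active_site": [5], "disulfide": [3]}
symbols = {"active_site": "*", "disulfide": "S"}

track = annotation_track(reference, features, symbols)
print(reference)
print(query)
print(track)
```

Keeping feature positions in ungapped residue numbering means the same feature list can be reused when the reference is realigned against a different set of sequences; only the residue-to-column mapping changes.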


Source journal

Applications in Plant Sciences (Plant Sciences)

CiteScore

7.30

Self-citation rate

0.00%

Publication volume

Review time

12 weeks

About the journal: Applications in Plant Sciences (APPS) is a monthly, peer-reviewed, open access journal promoting the rapid dissemination of newly developed, innovative tools and protocols in all areas of the plant sciences, including genetics, structure, function, development, evolution, systematics, and ecology. Given the rapid progress today in technology and its application in the plant sciences, the goal of APPS is to foster communication within the plant science community to advance scientific research. APPS is a publication of the Botanical Society of America, originating in 2009 as the American Journal of Botany's online-only section, AJB Primer Notes & Protocols in the Plant Sciences. APPS publishes the following types of articles: (1) Protocol Notes describe new methods and technological advancements; (2) Genomic Resources Articles characterize the development and demonstrate the usefulness of newly developed genomic resources, including transcriptomes; (3) Software Notes detail new software applications; (4) Application Articles illustrate the application of a new protocol, method, or software application within the context of a larger study; (5) Review Articles evaluate available techniques, methods, or protocols; (6) Primer Notes report novel genetic markers with evidence of wide applicability.