Automatic protein function annotation through candidate ortholog clusters from incomplete genomes

2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05). Pub Date: 2005-08-08. DOI: 10.1109/CSBW.2005.27

A. Vashist, C. Kulikowski, I. Muchnik


Citations: 1

Abstract

Protein function often has to be annotated for genomes that are only partially complete, yet this setting is not adequately addressed. We present an annotation method that extracts ortholog clusters from incomplete genomes that are evolutionarily closely related to the genome of interest. To construct clusters, the method focuses on sequence similarities across genomes rather than similarities between sequences within a genome. We use quasi-concave set function optimization to extract ortholog clusters as extreme groups of sequences, that is, groups in which the similarity of the least similar sequence is maximal. A protein sequence is annotated with the ortholog cluster to which its average similarity is highest. We applied this method to annotate the rice proteome using clusters constructed from four partially complete cereal proteomes and the complete proteome of Arabidopsis.
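The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give their exact objective. The sketch assumes a hypothetical linkage function (the sum of a sequence's similarities to group members from other genomes) and uses the standard greedy peeling procedure, which maximizes the minimum linkage when the linkage can only shrink as members are removed. All names (`linkage`, `extract_cluster`, `annotate`) and the toy similarity values are invented for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Pairwise similarities keyed by an unordered pair of sequence ids.
Similarity = Dict[FrozenSet[str], float]


def sim(s: Similarity, a: str, b: str) -> float:
    """Similarity of a and b; missing pairs count as 0."""
    return s.get(frozenset((a, b)), 0.0)


def linkage(seq: str, group: Set[str], genome_of: Dict[str, str],
            s: Similarity) -> float:
    """Assumed linkage: total similarity of seq to group members that
    come from a different genome (within-genome pairs are ignored)."""
    return sum(sim(s, seq, other) for other in group
               if other != seq and genome_of[other] != genome_of[seq])


def extract_cluster(seqs: Iterable[str], genome_of: Dict[str, str],
                    s: Similarity) -> Tuple[Set[str], float]:
    """Greedy peeling: repeatedly drop the weakest-linked sequence and
    keep the intermediate group whose weakest member is strongest."""
    current = set(seqs)
    best, best_score = set(current), float("-inf")
    while current:
        scores = {x: linkage(x, current, genome_of, s) for x in current}
        weakest = min(sorted(current), key=scores.__getitem__)
        if scores[weakest] > best_score:
            best, best_score = set(current), scores[weakest]
        current.remove(weakest)
    return best, best_score


def annotate(query: str, clusters: Dict[str, Set[str]],
             s: Similarity) -> str:
    """Return the name of the cluster with the highest average
    similarity between the query and the cluster's members."""
    def average(members: Set[str]) -> float:
        return sum(sim(s, query, m) for m in members) / len(members)
    return max(clusters, key=lambda name: average(clusters[name]))


# Toy data (invented): three genomes A, B, C and one query sequence q.
GENOME_OF = {"a1": "A", "b1": "B", "b2": "B", "c1": "C"}
TOY_SIM: Similarity = {
    frozenset(("a1", "b1")): 0.9,
    frozenset(("a1", "c1")): 0.8,
    frozenset(("b1", "c1")): 0.85,
    frozenset(("a1", "b2")): 0.1,
    frozenset(("b1", "b2")): 0.95,  # same genome, ignored by linkage
    frozenset(("q", "a1")): 0.7,
    frozenset(("q", "b1")): 0.6,
    frozenset(("q", "c1")): 0.8,
    frozenset(("q", "b2")): 0.2,
}

cluster, score = extract_cluster(GENOME_OF, GENOME_OF, TOY_SIM)
label = annotate("q", {"cluster_1": cluster, "cluster_2": {"b2"}}, TOY_SIM)
```

On the toy data, `b2` is peeled off first because its only cross-genome similarity is weak, even though it is highly similar to `b1` within the same genome. This mirrors the abstract's emphasis on similarities across genomes rather than within one.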
