An SVM-based algorithm for identification of photosynthesis-specific genome features
Gong-Xin Yu, G. Ostrouchov, A. Geist, N. Samatova
Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. Published 2003-08-11.
DOI: 10.1109/CSB.2003.1227323 (https://doi.org/10.1109/CSB.2003.1227323)
This paper presents a novel algorithm for identifying and functionally characterizing "key" genome features responsible for a particular biochemical process of interest. The central idea is that an individual genome feature is identified as "key" if including or excluding it sufficiently affects the accuracy of discriminating between two classes of genomes with respect to the given biochemical process. In this paper, genome features are defined by high-resolution gene functions, and the discrimination procedure uses support vector machine classification. Applied to the oxygenic photosynthetic process, the algorithm yielded 126 high-confidence candidate genome features. While many of these features are well-known components of the oxygenic photosynthetic process, others are completely unknown, including some hypothetical proteins. These results indicate that the algorithm is capable of discovering features related to a targeted biochemical process.
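The inclusion/exclusion idea can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the toy presence/absence matrix, the linear SVM trained by batch subgradient descent, the use of training accuracy in place of a cross-validated estimate, the 0.2 threshold, and all function names. It is not the authors' procedure.

```python
# Toy sketch (assumed setup, not the paper's method): flag a genome feature
# as "key" when excluding it lowers the SVM's discrimination accuracy by at
# least a chosen threshold.

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear soft-margin SVM by batch subgradient descent on the hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # this genome violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j]
                grad_b -= yi
        w = [wj - lr * (lam * wj + gj / n) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(X, y, w, b):
    """Fraction of genomes whose predicted class matches the label."""
    hits = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        hits += (1 if score > 0 else -1) == yi
    return hits / len(X)

def drop_column(X, k):
    """Return a copy of X without feature k."""
    return [[v for j, v in enumerate(row) if j != k] for row in X]

def key_features(X, y, threshold=0.2):
    """Score each feature by the accuracy lost when it is excluded."""
    w, b = train_linear_svm(X, y)
    baseline = accuracy(X, y, w, b)  # training accuracy, for brevity only
    drops = {}
    for k in range(len(X[0])):
        Xk = drop_column(X, k)
        wk, bk = train_linear_svm(Xk, y)
        drops[k] = baseline - accuracy(Xk, y, wk, bk)
    key = [k for k, drop in drops.items() if drop >= threshold]
    return baseline, drops, key

# Hypothetical data: rows are genomes, columns are gene functions
# (+1 = present, -1 = absent). Only column 0 differs between the classes.
X = [
    [ 1,  1, -1,  1],
    [ 1, -1,  1, -1],
    [ 1,  1,  1, -1],
    [ 1, -1, -1,  1],
    [-1,  1, -1,  1],
    [-1, -1,  1, -1],
    [-1,  1,  1, -1],
    [-1, -1, -1,  1],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]  # +1 = photosynthetic, -1 = not (toy labels)

baseline, drops, key = key_features(X, y)
print(key)  # [0]
```

In this toy matrix, excluding column 0 reduces accuracy from 1.0 to 0.5, while excluding any other column changes nothing, so only column 0 is flagged. A realistic analysis would estimate accuracy on held-out genomes and would need to handle redundant features, which single-feature exclusion alone cannot detect.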