Exploring Structurally Similar Protein Sequence Motifs using Relative-Distance Measures
K. Srinivasa, M. Jagadish, S. Prashanth, K. Venugopal, L. Patnaik
2006 Fourth International Conference on Intelligent Sensing and Information Processing, 2006-12-01
DOI: 10.1109/ICISIP.2006.4286077 (https://doi.org/10.1109/ICISIP.2006.4286077)
Abstract
Protein sequence motifs are short conserved subsequences common to related protein sequences. Information about motifs is extremely important to the study of biologically significant conserved regions in protein families, since these regions can determine the function and conformation of proteins. Conventionally, recurring patterns in proteins are explored by taking short protein segments and classifying them with similarity measures between segments. Two protein sequences are assigned to the same class if they show high homology in the feature patterns extracted by sequence alignment algorithms. Such methods find only position-specific motifs. In this paper, we propose a new algorithm that explores protein sequences by studying subsequences in terms of the relative positioning of their amino acids, followed by K-Means clustering of fixed-size segments. The dataset used in our work is the most up-to-date among studies of sequence motifs. Biochemical tests found in the literature are used to assess the significance of the motifs, and these tests show that the generated motifs are of both structural and functional interest. The results suggest that this method may also be applied to the closely related area of finding DNA motifs.
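
As a rough illustration of the pipeline summarized above (fixed-size segments, an encoding based on the relative positions of amino acids, then K-Means clustering), a minimal Python sketch might look like the following. The abstract does not specify the encoding or any parameters, so the mean-gap feature, the function names, the window width, and the toy sequences are all assumptions made for illustration; this is not the authors' algorithm.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def segments(sequence, width):
    """Return every fixed-size window of `width` residues in `sequence`."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def relative_distance_features(segment):
    """Encode a segment by the mean gap between repeats of each residue.

    Hypothetical encoding (an assumption, not taken from the paper): the
    value for a residue is 0.0 when it occurs fewer than two times, so the
    vector depends on relative spacing rather than absolute position.
    """
    features = []
    for aa in AMINO_ACIDS:
        positions = [i for i, ch in enumerate(segment) if ch == aa]
        gaps = [b - a for a, b in zip(positions, positions[1:])]
        features.append(sum(gaps) / len(gaps) if gaps else 0.0)
    return features


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def k_means(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        converged = new_labels == labels
        labels = new_labels
        if converged:
            break
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


if __name__ == "__main__":
    # Made-up toy sequences, used only to exercise the functions.
    toy_sequences = ["MKVLAAGIVKLAAG", "GAVKLAAGMKVLAA"]
    windows = [w for s in toy_sequences for w in segments(s, 9)]
    vectors = [relative_distance_features(w) for w in windows]
    _, cluster_ids = k_means(vectors, k=2)
    for window, cluster in zip(windows, cluster_ids):
        print(window, cluster)
```

Because the feature for each residue is built from gaps between occurrences, shifting a pattern inside the window leaves the vector unchanged, which is the property that distinguishes this kind of encoding from a position-specific one. A real study would likely use a richer encoding and a library K-Means implementation.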