Design of an incremental clustering package for protein function and family analysis
Chien-Yu Chen, Hsueh‐Fen Juan, Po-Jen Hsiao, Shui-Tein Chen, Hsiang-Wen Tseng, Yen-Jen Oyang
Fifth International Symposium on Multimedia Software Engineering, 2003. Proceedings.
DOI: 10.1109/MMSE.2003.1254454 (https://doi.org/10.1109/MMSE.2003.1254454)
Abstract
Protein clustering is widely used to support in-depth analysis of protein functions and families. We discuss the design of an incremental protein clustering package that provides comprehensive features for such analysis. Specifically, the package offers alternative options for carrying out high-quality protein clustering from different perspectives. The incremental nature of the clustering algorithm is essential for analyzing contemporary protein databases efficiently, since their sizes are growing rapidly. As for clustering quality, experiments applying the incremental algorithm to protein sequence analysis show that it identifies protein sequence clusters that match protein families more consistently than the single-link algorithm, the most widely used hierarchical clustering algorithm for protein sequence analysis. We also describe the implementation techniques used to improve system performance.
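For readers unfamiliar with the single-link baseline named in the abstract, the following is a minimal sketch of single-link clustering cut at a fixed distance threshold. It illustrates the baseline only; it is not the package's incremental algorithm, which the abstract does not describe. The function name `single_link_clusters`, the matrix `toy_dist`, and the distance values are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def single_link_clusters(dist, threshold):
    """Cut a single-link hierarchy at `threshold`.

    Two items end up in the same cluster whenever they are connected by a
    chain of pairs whose distance is <= threshold (connected components).
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if dist[i][j] <= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri  # merge the two clusters

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical pairwise distances between four sequences (made-up values).
# Items 0 and 2 are far apart (0.9) but both are close to item 1 (0.2).
toy_dist = [
    [0.0, 0.2, 0.9, 1.0],
    [0.2, 0.0, 0.2, 1.0],
    [0.9, 0.2, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]

print(single_link_clusters(toy_dist, 0.3))
```

In this toy input, items 0 and 2 are merged through item 1 even though they are dissimilar to each other. This chaining behavior is a commonly cited weakness of single-link clustering; whether it accounts for the results reported in the abstract is not stated there.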