{"title":"A new memetic algorithm for multiple graph alignment","authors":"Trần Ngọc Hà, Le Nhu Hien, H. X. Huan","doi":"10.25073/2588-1086/VNUCSCE.194","DOIUrl":null,"url":null,"abstract":"One of the main tasks of structural biology is comparing the structure of proteins. Comparisons of protein structure can determine their functional similarities. Multigraph alignment is a useful tool for identifying functional similarities based on structural analysis. This article proposes a new algorithm for aligning protein binding sites called ACOTS-MGA. This algorithm is based on the memetic scheme. It uses the ACO method to construct a set of solutions, then selects the best solution for implementing Tabu Search to improve the solution quality. Experimental results have shown that ACOTS-MGA outperforms state-of-the-art algorithms while producing alignments of better quality.KeywordsMultiple Graph Alignment, Tabu Search, Ant Colony Optimization, local search, memetic algorithm, SMMAS pheromone update rule, protein active sitesReferencesE. Todd, C. A. Orengo, and J. M. Thornton, “Evolution of function in protein superfamilies, from a structural perspective,” J. Mol. Biol., vol. 307, no. 4, pp. 1113–1143, Apr. 2001.S. F. Altschul et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Res., vol. 25, pp. 3389–3402, 1997.R. C. Edgar, “MUSCLE: multiple sequence alignment with high accuracy and high throughput,” Nucleic Acids Res., vol. 32, no. 5, pp. 1792–1797, Mar. 2004.J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Res., vol. 22, no. 22, pp. 4673–4680, Nov. 1994.M. Larkin, G. Blackshields, N. Brown, … R. C.-, and undefined 2007, “Clustal W and Clustal X version 2.0,” academic.oup.com.C. Notredame, D. G. Higgins, and J. Heringa, “T-coffee: a novel method for fast and accurate multiple sequence alignment,” J. Mol. Biol., vol. 302, no. 1, pp. 205–217, Sep. 2000.K. Sjolander, “Phylogenomic inference of protein molecular function: advances and challenges,” Bioinformatics, vol. 20, no. 2, pp. 170–179, Jan. 2004.T. Fober, M. Mernberger, G. Klebe, and E. Hüllermeier, “Evolutionary construction of multiple graph alignments for the structural analysis of biomolecules,” Bioinformatics, vol. 25, no. 16, pp. 2110–2117, 2009.M. Mernberger, G. Klebe, and E. Hullermeier, “SEGA: Semiglobal Graph Alignment for Structure-Based Protein Comparison,” IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 8, no. 5, pp. 1330–1343, Sep. 2011.D. Shasha, J. T. L. Wang, and R. Giugno, “Algorithmics and applications of tree and graph searching,” in Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS ’02, 2002, p. 39.R. V. Spriggs, P. J. Artymiuk, and P. Willett, “Searching for Patterns of Amino Acids in 3D Protein Structures,” J. Chem. Inf. Comput. Sci., vol. 43, no. 2, pp. 412–421, Mar. 2003.D. Conte, P. Foggia, C. Sansone, And M. Vento, “Thirty years of graph matching in pattern recognition,” Int. J. Pattern Recognit. Artif. Intell., vol. 18, no. 3, pp. 265–298, May 2004.K. Kinoshita and H. Nakamura, “Identification of the ligand binding sites on the molecular surface of proteins,” Protein Sci., vol. 14, no. 3, pp. 711–718, Mar. 2005.O. Kuchaiev and N. Pržulj, “Integrative network alignment reveals large regions of global network similarity in yeast and human,” Bioinformatics, vol. 27, 2011.Xifeng Yan, Feida Zhu, Jiawei Han, and P. S. Yu, “Searching Substructures with Superimposed Distance,” in 22nd International Conference on Data Engineering (ICDE’06), 2006, pp. 88–88.X. Yan, P. S. Yu, and J. Han, “Substructure similarity search in graph databases,” in Proceedings of the 2005 ACM SIGMOD international conference on Management of data - SIGMOD ’05, 2005, p. 766.S. Zhang, M. Hu, and J. Yang, “TreePi: A Novel Graph Indexing Method,” in 2007 IEEE 23rd International Conference on Data Engineering, 2007, pp. 966–975.A. E. Aladag and C. Erten, “SPINAL: scalable protein interaction network alignment,” Bioinformatics, vol. 29, pp. 917–924, 2013.S. Schmitt, D. Kuhn, and G. Klebe, “A New Method to Detect Related Function Among Proteins Independent of Sequence and Fold Homology,” J. Mol. Biol., vol. 323, no. 2, pp. 387–406, Oct. 2002.M. Hendlich, A. Bergner, J. Günther, and G. Klebe, “Relibase: Design and Development of a Database for Comprehensive Analysis of Protein–Ligand Interactions,” J. Mol. Biol., vol. 326, no. 2, pp. 607–620, Feb. 2003.N. Weskamp, E. Hüllermeier, D. Kuhn, and G. Klebe, “Multiple graph alignment for the structural analysis of protein active sites,” IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 4, no. 2, pp. 310–320, 2007.T. N. Ha, D. D. Dong, and H. X. Huan, “An efficient ant colony optimization algorithm for Multiple Graph Alignment,” in 2013 International Conference on Computing, Management and Telecommunications (ComManTel), 2013, pp. 386–391. F. Neri, Handbook of memetic algorithms, vol. 379. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011.M. Gong, Z. Peng, L. Ma, and J. Huang, “Global Biological Network Alignment by Using Efficient Memetic Algorithm,” IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 13, no. 6, pp. 1117–1129, Nov. 2016.J. M. Caldonazzo Garbelini, A. Y. Kashiwabara, and D. S. Sanches, “Sequence motif finder using memetic algorithm,” BMC Bioinformatics, vol. 19, 2018. L. Correa, B. Borguesan, C. Farfan, M. Inostroza-Ponta, and M. Dorn, “A Memetic Algorithm for 3-D Protein Structure Prediction Problem,” IEEE/ACM Trans. Comput. Biol. Bioinforma., pp. 1–1, 2016.H. Tran Ngoc, D. Do Duc, and H. Hoang Xuan, “A novel ant based algorithm for multiple graph alignment,” in 2014 International Conference on Advanced Technologies for Communications (ATC 2014), 2014, pp. 181–186. H. X. Huan, N. Linh-Trung, H.-T. Huynh, and others, “Solving the Traveling Salesman Problem with Ant Colony Optimization: A Revisit and New Efficient Algorithms,” REV J. Electron. Commun., vol. 2, no. 3–4, 2013. D. Do Duc, H. Q. Dinh, and H. Hoang Xuan, “On the Pheromone Update Rules of Ant Colony Optimization Approaches for the Job Shop Scheduling Problem,” 2008, pp. 153-160.","PeriodicalId":416488,"journal":{"name":"VNU Journal of Science: Computer Science and Communication Engineering","volume":"369 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"VNU Journal of Science: Computer Science and Communication Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.25073/2588-1086/VNUCSCE.194","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
One of the main tasks of structural biology is comparing the structure of proteins. Comparisons of protein structure can determine their functional similarities. Multigraph alignment is a useful tool for identifying functional similarities based on structural analysis. This article proposes a new algorithm for aligning protein binding sites called ACOTS-MGA. This algorithm is based on the memetic scheme. It uses the ACO method to construct a set of solutions, then selects the best solution for implementing Tabu Search to improve the solution quality. Experimental results have shown that ACOTS-MGA outperforms state-of-the-art algorithms while producing alignments of better quality.KeywordsMultiple Graph Alignment, Tabu Search, Ant Colony Optimization, local search, memetic algorithm, SMMAS pheromone update rule, protein active sitesReferencesE. Todd, C. A. Orengo, and J. M. Thornton, “Evolution of function in protein superfamilies, from a structural perspective,” J. Mol. Biol., vol. 307, no. 4, pp. 1113–1143, Apr. 2001.S. F. Altschul et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Res., vol. 25, pp. 3389–3402, 1997.R. C. Edgar, “MUSCLE: multiple sequence alignment with high accuracy and high throughput,” Nucleic Acids Res., vol. 32, no. 5, pp. 1792–1797, Mar. 2004.J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Res., vol. 22, no. 22, pp. 4673–4680, Nov. 1994.M. Larkin, G. Blackshields, N. Brown, … R. C.-, and undefined 2007, “Clustal W and Clustal X version 2.0,” academic.oup.com.C. Notredame, D. G. Higgins, and J. Heringa, “T-coffee: a novel method for fast and accurate multiple sequence alignment,” J. Mol. Biol., vol. 302, no. 1, pp. 205–217, Sep. 2000.K. Sjolander, “Phylogenomic inference of protein molecular function: advances and challenges,” Bioinformatics, vol. 20, no. 2, pp. 170–179, Jan. 2004.T. Fober, M. Mernberger, G. Klebe, and E. Hüllermeier, “Evolutionary construction of multiple graph alignments for the structural analysis of biomolecules,” Bioinformatics, vol. 25, no. 16, pp. 2110–2117, 2009.M. Mernberger, G. Klebe, and E. Hullermeier, “SEGA: Semiglobal Graph Alignment for Structure-Based Protein Comparison,” IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 8, no. 5, pp. 1330–1343, Sep. 2011.D. Shasha, J. T. L. Wang, and R. Giugno, “Algorithmics and applications of tree and graph searching,” in Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS ’02, 2002, p. 39.R. V. Spriggs, P. J. Artymiuk, and P. Willett, “Searching for Patterns of Amino Acids in 3D Protein Structures,” J. Chem. Inf. Comput. Sci., vol. 43, no. 2, pp. 412–421, Mar. 2003.D. Conte, P. Foggia, C. Sansone, And M. Vento, “Thirty years of graph matching in pattern recognition,” Int. J. Pattern Recognit. Artif. Intell., vol. 18, no. 3, pp. 265–298, May 2004.K. Kinoshita and H. Nakamura, “Identification of the ligand binding sites on the molecular surface of proteins,” Protein Sci., vol. 14, no. 3, pp. 711–718, Mar. 2005.O. Kuchaiev and N. Pržulj, “Integrative network alignment reveals large regions of global network similarity in yeast and human,” Bioinformatics, vol. 27, 2011.Xifeng Yan, Feida Zhu, Jiawei Han, and P. S. Yu, “Searching Substructures with Superimposed Distance,” in 22nd International Conference on Data Engineering (ICDE’06), 2006, pp. 88–88.X. Yan, P. S. Yu, and J. Han, “Substructure similarity search in graph databases,” in Proceedings of the 2005 ACM SIGMOD international conference on Management of data - SIGMOD ’05, 2005, p. 766.S. Zhang, M. Hu, and J. Yang, “TreePi: A Novel Graph Indexing Method,” in 2007 IEEE 23rd International Conference on Data Engineering, 2007, pp. 966–975.A. E. Aladag and C. Erten, “SPINAL: scalable protein interaction network alignment,” Bioinformatics, vol. 29, pp. 917–924, 2013.S. Schmitt, D. Kuhn, and G. Klebe, “A New Method to Detect Related Function Among Proteins Independent of Sequence and Fold Homology,” J. Mol. Biol., vol. 323, no. 2, pp. 387–406, Oct. 2002.M. Hendlich, A. Bergner, J. Günther, and G. Klebe, “Relibase: Design and Development of a Database for Comprehensive Analysis of Protein–Ligand Interactions,” J. Mol. Biol., vol. 326, no. 2, pp. 607–620, Feb. 2003.N. Weskamp, E. Hüllermeier, D. Kuhn, and G. Klebe, “Multiple graph alignment for the structural analysis of protein active sites,” IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 4, no. 2, pp. 310–320, 2007.T. N. Ha, D. D. Dong, and H. X. Huan, “An efficient ant colony optimization algorithm for Multiple Graph Alignment,” in 2013 International Conference on Computing, Management and Telecommunications (ComManTel), 2013, pp. 386–391. F. Neri, Handbook of memetic algorithms, vol. 379. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011.M. Gong, Z. Peng, L. Ma, and J. Huang, “Global Biological Network Alignment by Using Efficient Memetic Algorithm,” IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 13, no. 6, pp. 1117–1129, Nov. 2016.J. M. Caldonazzo Garbelini, A. Y. Kashiwabara, and D. S. Sanches, “Sequence motif finder using memetic algorithm,” BMC Bioinformatics, vol. 19, 2018. L. Correa, B. Borguesan, C. Farfan, M. Inostroza-Ponta, and M. Dorn, “A Memetic Algorithm for 3-D Protein Structure Prediction Problem,” IEEE/ACM Trans. Comput. Biol. Bioinforma., pp. 1–1, 2016.H. Tran Ngoc, D. Do Duc, and H. Hoang Xuan, “A novel ant based algorithm for multiple graph alignment,” in 2014 International Conference on Advanced Technologies for Communications (ATC 2014), 2014, pp. 181–186. H. X. Huan, N. Linh-Trung, H.-T. Huynh, and others, “Solving the Traveling Salesman Problem with Ant Colony Optimization: A Revisit and New Efficient Algorithms,” REV J. Electron. Commun., vol. 2, no. 3–4, 2013. D. Do Duc, H. Q. Dinh, and H. Hoang Xuan, “On the Pheromone Update Rules of Ant Colony Optimization Approaches for the Job Shop Scheduling Problem,” 2008, pp. 153-160.
柏林,海德堡:施普林格柏林,海德堡,2011。龚志强,彭志强,“基于Memetic算法的全球生物网络定位”,中国科学院学报(自然科学版)。第一版。医学杂志。Bioinforma。,第13卷,第3期。6, pp. 1117-1129, 2016年11月。M. Caldonazzo Garbelini, A. Y. Kashiwabara和D. S. Sanches,“基于模因算法的序列基序查找器”,BMC生物信息学,第19卷,2018。L. Correa, B. Borguesan, C. Farfan, M. Inostroza-Ponta, M. Dorn,“三维蛋白质结构预测问题的模因算法”,IEEE/ACM Trans。第一版。医学杂志。Bioinforma。, pp. 1-1, 2016。Tran Ngoc, D. Do Duc和H. Hoang Xuan,“一种新的基于蚂蚁的多图对齐算法”,2014年国际先进通信技术会议(ATC 2014), 2014, pp. 181-186。黄洪祥,黄洪涛,“基于蚁群优化的旅行商问题求解:一种新的高效算法”,电子学报。Commun。,第2卷,第2期。3 - 4, 2013年。杜德德,丁洪清,黄轩,“蚁群优化方法的信息素更新规则研究”,2008,pp. 153-160。