DLPAlign: A Deep Learning based Progressive Alignment Method for Multiple Protein Sequences
Authors: Mengmeng Kuang, Yong Liu, Lufei Gao
Published in: CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics, 2020-11-19
DOI: 10.1145/3429210.3429221 (https://doi.org/10.1145/3429210.3429221)
This paper proposes a novel and straightforward approach to improving the accuracy of progressive multiple protein sequence alignment methods. We trained a decision-making model based on convolutional neural networks and bi-directional long short-term memory networks, and progressively aligned the input protein sequences by calculating different posterior probability matrices. To evaluate this method, we implemented a multiple sequence alignment (MSA) tool called DLPAlign and compared its performance with eleven leading alignment methods on three empirical alignment benchmarks (BAliBASE, OXBench and SABMark). Our results show that DLPAlign achieves the best total-column scores on all three benchmarks. When evaluated on the 711 low-similarity families with average percent identity (PID) ≤ 30%, DLPAlign improved by about 2.8% over the second-best MSA software. In addition, we compared the performance of DLPAlign and other alignment tools on a real-life application, namely protein secondary structure prediction on four protein sequences related to SARS-CoV-2, and DLPAlign provided the best result in all cases.
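The two metrics named in the abstract, the total-column (TC) score and PID, can be illustrated with a minimal Python sketch. This is not the evaluation code used for DLPAlign: the function names are hypothetical, the TC score here counts every non-empty reference column (benchmark scorers may restrict scoring to annotated core regions), and PID is computed over columns where both sequences have a residue, which is only one of several definitions in use.

```python
from typing import List, Optional, Tuple

# A column is described by the residue index each sequence contributes,
# or None where that sequence has a gap.
Column = Tuple[Optional[int], ...]


def columns(alignment: List[str], gap: str = "-") -> List[Column]:
    """Map each alignment column to per-sequence residue indices."""
    lengths = {len(row) for row in alignment}
    if len(lengths) != 1:
        raise ValueError("all rows of an alignment must have the same length")
    counters = [0] * len(alignment)  # next residue index for each sequence
    result: List[Column] = []
    for col in range(lengths.pop()):
        entry: List[Optional[int]] = []
        for i, row in enumerate(alignment):
            if row[col] == gap:
                entry.append(None)
            else:
                entry.append(counters[i])
                counters[i] += 1
        result.append(tuple(entry))
    return result


def total_column_score(reference: List[str], test: List[str]) -> float:
    """Fraction of reference columns reproduced exactly in the test alignment.

    Assumption: all non-empty reference columns are scored.
    """
    ref_cols = [c for c in columns(reference) if any(x is not None for x in c)]
    test_cols = set(columns(test))
    if not ref_cols:
        return 0.0
    matched = sum(1 for c in ref_cols if c in test_cols)
    return matched / len(ref_cols)


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """PID over columns where both aligned sequences have a residue.

    Assumption: other PID definitions use a different denominator.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    identical = sum(1 for x, y in pairs if x == y)
    return 100.0 * identical / len(pairs)
```

For example, with the reference `["AC-GT", "ACAGT"]` and the test alignment `["ACG-T", "ACAGT"]`, three of the five reference columns are reproduced, so the TC score is 0.6. Because a single misplaced gap invalidates a whole column, TC is a stricter measure than pairwise scores.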