Genomic Comparison of Four Metapneumovirus Strains Using Decision Tree, Apriori Algorithm, ClustalW, and Phylogenetic Reconstruction
Sang-Ran Lim, Taeseon Yoon
Proceedings of the 5th International Conference on Bioinformatics Research and Applications, 2018-12-27
DOI: 10.1145/3309129.3309130 (https://doi.org/10.1145/3309129.3309130)
Citation count: 1
Abstract
Human metapneumovirus (HMPV) has persistently been the leading causative agent of acute respiratory infections in young children and the elderly worldwide. The respiratory tract illness caused by HMPV leads to substantial morbidity and mortality in children under five and in the immunocompromised. To study the genetic structure of HMPV, this paper conducts a genomic analysis of the nine genes (N, P, M, F, M2-1, M2-2, SH, G, and L) of human metapneumovirus subtypes A1, A2, B1, and B2. Through multiple sequence alignment, decision trees, the Apriori algorithm, and phylogenetic reconstruction, this paper investigates the genome-level and protein-level differences between HMPV strains. The results indicate that the four HMPV subtypes are highly similar while still displaying distinct attributes. The glycoprotein (G) and the small hydrophobic protein (SH) are found to display the most variance among the four subtypes. The Apriori algorithm shows that the amino acids serine and lysine are the most frequent among the four subtypes. Under the Apriori algorithm with a window of 19, the four subtypes display some degree of similarity in their frequencies of the amino acid lysine (K). On the other hand, the two clades of HMPV seem to split in their frequencies of the amino acid serine (S). Hence, the roles of the glycoprotein and the small hydrophobic protein, and the contribution of the amino acids serine and lysine to the nine polypeptides, are suggested as topics for future research.
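The abstract does not spell out how the 19-residue window feeds the Apriori algorithm, so the following is only a minimal sketch under stated assumptions: the protein sequence is cut into consecutive non-overlapping windows of 19 residues, each window is treated as one transaction, and the support of each single amino acid (the first pass of Apriori) is the fraction of windows that contain it. The function names `windows` and `single_item_support` and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def windows(seq, size=19):
    """Cut seq into consecutive non-overlapping windows (the last may be shorter)."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def single_item_support(seq, size=19):
    """Fraction of windows containing each residue (support of 1-itemsets)."""
    wins = windows(seq, size)
    counts = Counter()
    for w in wins:
        # set(w) counts a residue at most once per window, as in a transaction
        counts.update(set(w))
    return {aa: n / len(wins) for aa, n in counts.items()}


# Hypothetical toy sequence for illustration only, not real HMPV data.
toy = "MSKS" * 10
support = single_item_support(toy)
for aa in sorted(support, key=support.get, reverse=True):
    print(aa, round(support[aa], 3))
```

Comparing such per-residue supports across the four subtypes is one way the reported lysine (K) and serine (S) patterns could be examined; a full Apriori run would go on to build larger itemsets from the residues that pass a minimum support threshold.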