Distance-based pattern matching of DNA sequences for evaluating primary mutation
B. Kindhi, M. A. Hendrawan, D. Purwitasari, T. A. Sardjono, M. Purnomo
2017 2nd International conferences on Information Technology, Information Systems and Electrical Engineering (ICITISEE), 2017-11-01
DOI: 10.1109/ICITISEE.2017.8285518
Citations: 3
Abstract
String matching methods are often used to find patterns in DNA. However, basic string matching methods cannot recognize mutations in viruses and bacteria. The distance-based Hamming method tolerates character mismatches in a sequence, although its performance varies with the number of patterns compared. We modify the Hamming method to analyze nucleotide arrangement patterns in DNA with primary Hepatitis C Virus (HCV) infection. We chose HCV because Indonesia has shown the highest number of hepatitis cases in Southeast Asia. Our experiments use hepatitis DNA data from World Gen Bank and compare them against primary sequences from our partner institution. The problem we encountered is that the HCV primary sequences do not all have the same length, which makes the Hamming scores unbalanced. We propose normalizing each primary before it is tested against an isolate. The normalization yields a constant, which is then added to the Hamming count, so that the results for each primary against each isolate are balanced. The test results show that the modified Hamming method is able to give the distance between isolate and primary, and the pattern matching results are similar to the condition of the real primary. We propose this modified Hamming distance for analyzing virus or bacteria mutations, especially in HCV primaries.
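The abstract does not give the normalization formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each primary is slid along the isolate to find its best-matching window, and that the balancing constant is the length gap between the longest primary and the current one. The function names `hamming`, `best_window_distance`, and `balanced_scores`, and the toy sequences in the usage note, are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def best_window_distance(primary: str, isolate: str) -> int:
    """Smallest Hamming distance between the primary and any
    same-length window of the isolate."""
    m = len(primary)
    if m == 0 or m > len(isolate):
        raise ValueError("primary must be non-empty and no longer than isolate")
    return min(
        hamming(primary, isolate[i:i + m])
        for i in range(len(isolate) - m + 1)
    )


def balanced_scores(primaries: dict[str, str], isolate: str) -> dict[str, int]:
    """Hamming count plus a per-primary constant.

    ASSUMPTION (not from the paper): the constant is
    len(longest primary) - len(this primary), so that shorter
    primaries are not favored merely for having fewer positions.
    """
    longest = max(len(seq) for seq in primaries.values())
    return {
        name: best_window_distance(seq, isolate) + (longest - len(seq))
        for name, seq in primaries.items()
    }
```

For instance, with the toy inputs `{"p1": "ACGT", "p2": "ACGTAC"}` and the isolate `"TTACGAACGG"`, both primaries have a raw best-window count of 1, and the assumed constant separates them by their length difference. Other normalizations, such as dividing the count by the primary length, would be equally consistent with the abstract's wording.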