Kmer-based sequence representations for fast retrieval and comparison
Author: Zutao Wu
DOI: 10.5204/thesis.eprints.103083 (https://doi.org/10.5204/thesis.eprints.103083)
Published: 2017-01-01, Science & Engineering Faculty
This thesis presents a study of alignment-free methods for genetic sequence comparison. By representing sequences in terms of their k-mers (short subsequences of length k), the similarity of two sequences can be measured rapidly and accurately by calculating the distance between their representations. This research utilises and adapts conventional methods from information retrieval to generate novel representations for k-mers and sequence fragments. Precision was further improved through the use of machine learning approaches, especially neural networks, to learn relationships between k-mers and to generate enhanced sequence representations. These approaches have applications in large-scale sequence comparison, especially in the analysis of metagenomic samples.
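As a rough illustration of the alignment-free idea described above, the sketch below builds a k-mer count profile for each sequence and compares two profiles by cosine distance. It is not the thesis's method: the abstract does not say which representation or distance measure is used, so the choice of raw counts, cosine distance, and the helper names `kmer_counts` and `cosine_distance` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (hypothetical helper)."""
    seq = seq.upper()
    # A sequence shorter than k yields an empty profile.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity of two sparse k-mer count vectors."""
    # Only k-mers present in both profiles contribute to the dot product.
    dot = sum(count * b[kmer] for kmer, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty profile is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Made-up example sequences, compared with k = 3.
    query = kmer_counts("ACGTACGTGACG", 3)
    near = kmer_counts("ACGTACGTGACC", 3)
    far = kmer_counts("TTTTGGGGTTTT", 3)
    print("query vs near:", cosine_distance(query, near))
    print("query vs far: ", cosine_distance(query, far))
```

No alignment is computed at any point, so the cost of a comparison depends on the number of distinct k-mers in the two profiles, not on the product of the sequence lengths. The information-retrieval weightings and learned (neural) representations mentioned in the abstract would replace the raw counts used here.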