S. Nasser, A. Breland, Frederick C. Harris, Monica Nicolescu
A fuzzy classifier to taxonomically group DNA fragments within a metagenome
NAFIPS 2008 - 2008 Annual Meeting of the North American Fuzzy Information Processing Society, 2008-05-19
DOI: 10.1109/NAFIPS.2008.4531252 (https://doi.org/10.1109/NAFIPS.2008.4531252)
Citations: 11
Abstract
Extracting microorganisms directly from their natural environment for sequencing has become a popular technique. The resulting metagenomic fragments carry too little information to be assigned directly to taxonomic groups. In this paper, we implement a fuzzy k-means classifier that separates fragments into the taxonomic groups present in a metagenomic data set. The classifier groups shotgun sequence fragments as small as 500 base pairs according to their DNA signatures, namely GC content and oligonucleotide frequencies. We compare the results obtained with the different signatures. The classifier is also tested on an acid mine drainage metagenome, which it separates into classes representing the major Archaea and Bacteria groups. On this acid mine drainage data, a published environmental genome sample, the classification achieved an accuracy of 99%.
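The approach described in the abstract can be illustrated with a minimal sketch: compute a DNA signature per fragment (GC content plus oligonucleotide frequencies), then cluster the signatures with fuzzy k-means. This is not the authors' implementation. The function names, the dinucleotide default (`k=2`), the Euclidean distance, the fuzzifier `m=2.0`, and the farthest-point initialization are all assumptions chosen for illustration; the abstract does not specify them.

```python
import itertools


def gc_content(seq):
    """Fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def oligo_frequencies(seq, k=2):
    """Relative frequency of every k-mer over ACGT, in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # windows containing N or other symbols are skipped
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]


def signature(seq, k=2):
    """Feature vector: GC content followed by the 4**k k-mer frequencies."""
    return [gc_content(seq)] + oligo_frequencies(seq, k)


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_kmeans(points, n_clusters, m=2.0, iters=100, tol=1e-12):
    """Fuzzy k-means (fuzzy c-means) with fuzzifier m > 1.

    Returns (centers, u), where u[i][j] is the membership of point i in
    cluster j and every row of u sums to 1.
    """
    dim = len(points[0])
    # Assumed initialization: first point, then repeatedly the point
    # farthest from all centers chosen so far (deterministic).
    centers = [list(points[0])]
    while len(centers) < n_clusters:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))

    u = []
    for _ in range(iters):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        # Squared distances are clamped to avoid division by zero.
        u = []
        for p in points:
            inv = [max(sq_dist(p, c), 1e-12) ** (-1.0 / (m - 1.0))
                   for c in centers]
            s = sum(inv)
            u.append([v / s for v in inv])

        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j in range(n_clusters):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total
                 for d in range(dim)]
            )

        # Stop once no center moves by more than tol (squared distance).
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u
```

To use it, build `points = [signature(f) for f in fragments]` from a list of fragment strings, call `fuzzy_kmeans(points, n_clusters)`, and assign each fragment to the cluster with the highest membership. Unlike hard k-means, the membership rows also show how ambiguous each assignment is, which matters for short fragments whose signatures are noisy.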