A new Granular Computing approach for sequences representation and classification
A. Rizzi, G. D. Vescovo, L. Livi, F. Mascioli
The 2012 International Joint Conference on Neural Networks (IJCNN), 2012-06-10
DOI: 10.1109/IJCNN.2012.6252680 (https://doi.org/10.1109/IJCNN.2012.6252680)
Citations: 24
Abstract
In this paper we present an innovative procedure for sequence mining and representation. It can be used on its own in Data Mining problems or as the core of a classification system based on a Granular Computing approach that represents sequences in a suitable embedding space. By adopting an inexact sequence matching procedure, the algorithm extracts an alphabet of symbols, consisting of frequent subsequences, to be used as prototypes for the embedding stage. Experimental evaluation on both synthetically generated and biological datasets confirms that the modeling system can synthesize effective models even when facing complex and noisy problems defined by frequency-based classification rules.
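To make the embedding idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that inexact matching is done with the Levenshtein (edit) distance over fixed-width sliding windows, and that a sequence is embedded as a vector counting how often each prototype subsequence occurs within a given tolerance. The abstract does not specify the dissimilarity measure, the tolerance, or how the alphabet is mined, so the names `edit_distance`, `count_inexact_matches`, `embed`, and the `tolerance` parameter are all hypothetical, and the alphabet is supplied by hand here.

```python
from typing import List, Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def count_inexact_matches(sequence: str, prototype: str, tolerance: int) -> int:
    """Count windows of len(prototype) within `tolerance` edits of the prototype."""
    width = len(prototype)
    return sum(
        1
        for start in range(len(sequence) - width + 1)
        if edit_distance(sequence[start:start + width], prototype) <= tolerance
    )


def embed(sequence: str, alphabet: List[str], tolerance: int = 1) -> List[int]:
    """Map a sequence to a vector of inexact occurrence counts, one per prototype.

    Assumption: `alphabet` stands in for the mined set of frequent subsequences.
    """
    return [count_inexact_matches(sequence, p, tolerance) for p in alphabet]


if __name__ == "__main__":
    alphabet = ["ABC", "BD"]  # hand-picked prototypes, for illustration only
    print(embed("ABCABDABC", alphabet, tolerance=0))  # [2, 1]
    print(embed("ABCABDABC", alphabet, tolerance=1))  # [3, 3]
```

With `tolerance=0` the vector holds exact occurrence counts. Raising the tolerance lets noisy variants such as `ABD` count toward the prototype `ABC`, which is the kind of robustness that inexact matching is meant to provide. The resulting fixed-length vectors could then be passed to any standard classifier.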