Web Pages Classification using Concept Analysis
G. D. Lucca, A. R. Fasolino, Porfirio Tramontana
2007 IEEE International Conference on Software Maintenance, 2007-10-22
DOI: https://doi.org/10.1109/ICSM.2007.4362651
Citation count: 6
Abstract
Analyzing and classifying the user interfaces of Web applications is a relevant problem in Web maintenance processes. This paper presents an approach for reliably classifying the HTML pages of a dynamic Web application. The approach assumes that groups of semantically equivalent built pages share the same key features, which can be used to discriminate among pages. These features are obtained through an iterative process that uses formal concept analysis to find the features specific to each class of pages. The process is supported by a toolkit that helps define the discriminating features effectively. The approach has been preliminarily validated in an experiment that produced encouraging results.
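
As an illustration of how formal concept analysis can surface class-specific features, the following is a minimal sketch in Python. It is not the authors' toolkit or their iterative process, neither of which the abstract describes in detail. The page names, the feature names, and the rule "a feature is discriminating if it occurs on exactly the pages of one class" are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical formal context: each page (object) maps to the set of
# features (attributes) observed in its HTML. All names are invented.
CONTEXT = {
    "login_1": {"form", "password_field"},
    "login_2": {"form", "password_field", "error_banner"},
    "catalog_1": {"table", "price_column"},
    "catalog_2": {"table", "price_column", "form"},
}


def common_features(pages, context):
    """Intent: the features shared by every page in `pages`."""
    feature_sets = [context[p] for p in pages]
    if not feature_sets:
        # By convention, the empty set of pages shares every feature.
        return set().union(*context.values())
    return set.intersection(*feature_sets)


def pages_with(features, context):
    """Extent: the pages that exhibit every feature in `features`."""
    return {p for p, feats in context.items() if features <= feats}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every page subset.

    This naive enumeration is exponential in the number of pages and is
    only suitable for very small contexts.
    """
    concepts = set()
    pages = sorted(context)
    for size in range(len(pages) + 1):
        for subset in combinations(pages, size):
            intent = common_features(subset, context)
            extent = pages_with(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


def discriminating_features(class_pages, context):
    """Features that occur on exactly the pages of the given class.

    This selection rule is an assumption of the sketch, not a rule
    stated in the abstract.
    """
    all_features = set().union(*context.values())
    target = set(class_pages)
    return {f for f in all_features if pages_with({f}, context) == target}
```

With this toy context, `discriminating_features(["login_1", "login_2"], CONTEXT)` selects `password_field` only: `form` also appears on a catalog page, and `error_banner` appears on just one login page, so neither separates the login class from the rest.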