Ontology Based Web Page Classification System by Using Enhanced C4.5 and Naïve Bayesian Classifiers
Authors: Hnin Pwint Myu Wai, Phyu Phyu Tar, P. Thwe
Published in: 2018 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2018-10-01
DOI: 10.1109/ICIIBMS.2018.8549994 (https://doi.org/10.1109/ICIIBMS.2018.8549994)
Citations: 4
Abstract
Today, the web is a huge repository of information, which creates a need for accurate automated classifiers for web pages. Web page classification is essential to many tasks in web information retrieval, such as maintaining web directories and focused crawling. This paper therefore proposes a web page classification system based on semantic logic. For semantics, the system uses an ontology that stores the concept of each word. For classification, the system proposes enhanced C4.5 decision tree and Naive Bayesian (NB) classifiers. In the original C4.5 classification algorithm, the traditional entropy measure is unable to measure the appropriateness of nodes when the class labels are the same. By using semantic technology, the system can effectively support the classification of web pages into categories. To show its effectiveness, the system is tested on HTML documents in the computer science domain.
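For readers unfamiliar with the entropy measure mentioned above, the following is a minimal sketch of the textbook C4.5 split criterion (entropy, information gain, and gain ratio). It is not the enhanced measure proposed in the paper, which the abstract does not specify. The function names, the attribute values ("yes"/"no" for whether a page contains a hypothetical ontology concept), and the category labels ("AI", "DB") are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    counts = Counter(items)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(values, labels):
    """Textbook C4.5 gain ratio for splitting on one categorical attribute.

    values: attribute value of each example (hypothetical concept flags here)
    labels: class label of each example
    """
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data (assumed for illustration only).
labels = ["AI", "AI", "DB", "DB"]
informative = ["yes", "yes", "no", "no"]    # separates the classes perfectly
uninformative = ["yes", "no", "yes", "no"]  # tells us nothing about the class

print(gain_ratio(informative, labels))
print(gain_ratio(uninformative, labels))
print(entropy(["AI", "AI"]))  # all labels identical
```

In this sketch, a node whose examples all share one class label has zero entropy, so every candidate split of that node scores the same under the standard criterion. This may be related to the limitation the abstract describes, although the abstract gives no further detail.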