Ontology Based Web Page Classification System by Using Enhanced C4.5 and Naïve Bayesian Classifiers
Hnin Pwint Myu Wai, Phyu Phyu Tar, P. Thwe
2018 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2018-10-01
DOI: 10.1109/ICIIBMS.2018.8549994 (https://doi.org/10.1109/ICIIBMS.2018.8549994)
Today the web is a huge repository of information, which creates a need for accurate automated classifiers for web pages. Web page classification is essential to many tasks in web information retrieval, such as maintaining web directories and focused crawling. This paper therefore proposes a web page classification system based on semantic logic. For semantics, the system uses an ontology that stores the concept of each word. For classification, it proposes the enhanced C4.5 decision tree and Naive Bayesian (NB) classifiers. In the original C4.5 classification algorithm, the traditional entropy measure cannot measure the appropriateness of nodes when the class labels are the same. By using semantic technology, the system can effectively support classifying web pages into categories. To show its effectiveness, the system is tested on HTML documents in the computer science domain.
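To make the remark about the entropy measure concrete, the following is a minimal sketch of the standard (textbook) C4.5 quantities, Shannon entropy and gain ratio. It is not the enhanced measure proposed in the paper, which the abstract does not specify; the function names `entropy` and `gain_ratio` and the toy data are assumptions for illustration only. It shows that when every sample at a node carries the same class label the entropy is zero, so the gain ratio of every candidate split is zero as well and the measure gives no basis for ranking features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # A node that is empty or holds a single class is pure: entropy is zero.
    if total == 0 or len(counts) == 1:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(feature_values, labels):
    """Standard C4.5 gain ratio of splitting `labels` by `feature_values`."""
    total = len(labels)
    # Group the class labels by the value of the candidate feature.
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted entropy remaining after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information is the entropy of the feature values themselves.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: a binary feature over four pages.
feature = ["x", "x", "y", "y"]
mixed_labels = ["a", "a", "b", "b"]   # feature separates the classes perfectly
same_labels = ["a", "a", "a", "a"]    # every page has the same class label

print(gain_ratio(feature, mixed_labels))
print(gain_ratio(feature, same_labels))
```

With the mixed labels the feature receives a positive score, while with identical labels every feature scores zero, which is the situation the abstract describes for the original algorithm.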