Automatic classification of Web pages based on the concept of domain ontology

Mu-Hee Song, Soo-Yeon Lim, Dong-Jin Kang, Sang-Jo Lee
Published in: 12th Asia-Pacific Software Engineering Conference (APSEC'05)
Publication date: 2005-12-15
DOI: 10.1109/APSEC.2005.46 (https://doi.org/10.1109/APSEC.2005.46)
Citations: 40

Abstract

Ontologies have been used increasingly in recent years as a mechanism for enabling machine reasoning. This paper proposes an automated method for document classification using an ontology, which expresses the terminology and vocabulary contained in Web documents as a hierarchical structure. Ontology-based document classification involves determining the document features that represent Web documents most accurately and then, after analyzing their contents, classifying the documents into the most appropriate categories, using at least two predefined categories for the given document features. In this paper, Web pages are classified in real time, not with experimental data or a learning process, but by calculating the similarity between the terminology information extracted from Web pages and the ontology categories. This results in more accurate document classification, since the meanings and relationships unique to each document are determined.
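
The abstract does not say which similarity measure is used or how terms are extracted, so the following is only a minimal sketch of the general idea: compare the terms found on a page with the terms attached to each ontology category, with no training step. The toy ontology, its weights, the regex tokenizer, the choice of cosine similarity, and the names `ONTOLOGY`, `extract_terms`, `cosine`, and `classify` are all assumptions made for illustration, not the authors' method.

```python
import math
import re
from collections import Counter

# Hypothetical toy ontology (category -> weighted terms); not taken from the paper.
ONTOLOGY = {
    "economy": {"market": 1.0, "stock": 1.0, "bank": 0.8, "trade": 0.8},
    "sports": {"match": 1.0, "team": 1.0, "score": 0.8, "league": 0.8},
}


def extract_terms(text):
    """Count lowercase alphabetic tokens (a stand-in for real term extraction)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, ontology=ONTOLOGY):
    """Return (best_category, scores); best_category is None if nothing matches."""
    terms = extract_terms(text)
    scores = {cat: cosine(terms, vocab) for cat, vocab in ontology.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores


if __name__ == "__main__":
    label, scores = classify("The team won the match with a late score in the league")
    print(label, scores)
```

Because the category vocabularies are fixed in advance, a page can be scored as soon as its terms are extracted, which is what makes classification without a learning phase possible. A fuller version would also use the hierarchical relationships between ontology terms, which this flat sketch ignores.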