基于Web使用挖掘、Web抓取和语义注释的信息提取

S. K. Malik, S. Rizvi
{"title":"基于Web使用挖掘、Web抓取和语义注释的信息提取","authors":"S. K. Malik, S. Rizvi","doi":"10.1109/CICN.2011.97","DOIUrl":null,"url":null,"abstract":"Extracting useful information from the web is the most significant issue of concern for the realization of semantic web. This may be achieved by several ways among which Web Usage Mining, Web Scrapping and Semantic Annotation plays an important role. Web mining enables to find out the relevant results from the web and is used to extract meaningful information from the discovery patterns kept back in the servers. Web usage mining is a type of web mining which mines the information of access routes/manners of users visiting the web sites. Web scraping, another technique, is a process of extracting useful information from HTML pages which may be implemented using a scripting language known as Prolog Server Pages(PSP) based on Prolog. Third, Semantic annotation is a technique which makes it possible to add semantics and a formal structure to unstructured textual documents, an important aspect in semantic information extraction which may be performed by a tool known as KIM(Knowledge Information Management). In this paper, we revisit, explore and discuss some information extraction techniques on web like web usage mining, web scrapping and semantic annotation for a better or efficient information extraction on the web illustrated with examples.","PeriodicalId":292190,"journal":{"name":"2011 International Conference on Computational Intelligence and Communication Networks","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"68","resultStr":"{\"title\":\"Information Extraction Using Web Usage Mining, Web Scrapping and Semantic Annotation\",\"authors\":\"S. K. Malik, S. Rizvi\",\"doi\":\"10.1109/CICN.2011.97\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Extracting useful information from the web is the most significant issue of concern for the realization of semantic web. This may be achieved by several ways among which Web Usage Mining, Web Scrapping and Semantic Annotation plays an important role. Web mining enables to find out the relevant results from the web and is used to extract meaningful information from the discovery patterns kept back in the servers. Web usage mining is a type of web mining which mines the information of access routes/manners of users visiting the web sites. Web scraping, another technique, is a process of extracting useful information from HTML pages which may be implemented using a scripting language known as Prolog Server Pages(PSP) based on Prolog. Third, Semantic annotation is a technique which makes it possible to add semantics and a formal structure to unstructured textual documents, an important aspect in semantic information extraction which may be performed by a tool known as KIM(Knowledge Information Management). In this paper, we revisit, explore and discuss some information extraction techniques on web like web usage mining, web scrapping and semantic annotation for a better or efficient information extraction on the web illustrated with examples.\",\"PeriodicalId\":292190,\"journal\":{\"name\":\"2011 International Conference on Computational Intelligence and Communication Networks\",\"volume\":\"53 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-10-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"68\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 International Conference on Computational Intelligence and Communication Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CICN.2011.97\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Computational Intelligence and Communication Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CICN.2011.97","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 68

摘要

从web中提取有用的信息是实现语义web最重要的问题。这可以通过几种方法来实现,其中Web Usage Mining、Web Scrapping和Semantic Annotation起着重要的作用。Web挖掘能够从Web中找到相关的结果,并用于从保存在服务器中的发现模式中提取有意义的信息。Web使用挖掘是一种挖掘用户访问网站的路径/方式信息的Web挖掘方法。网络抓取是另一种技术,它是从HTML页面中提取有用信息的过程,可以使用基于Prolog的脚本语言Prolog Server pages (PSP)来实现。第三,语义注释是一种为非结构化文本文档添加语义和正式结构的技术,这是语义信息提取的一个重要方面,可以通过称为KIM(知识信息管理)的工具来执行。本文对web上的一些信息提取技术进行了回顾、探索和讨论,如web使用挖掘、web废弃和语义注释等,以期更好或更有效地在web上进行信息提取。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Information Extraction Using Web Usage Mining, Web Scrapping and Semantic Annotation
Extracting useful information from the web is the most significant issue of concern for the realization of semantic web. This may be achieved by several ways among which Web Usage Mining, Web Scrapping and Semantic Annotation plays an important role. Web mining enables to find out the relevant results from the web and is used to extract meaningful information from the discovery patterns kept back in the servers. Web usage mining is a type of web mining which mines the information of access routes/manners of users visiting the web sites. Web scraping, another technique, is a process of extracting useful information from HTML pages which may be implemented using a scripting language known as Prolog Server Pages(PSP) based on Prolog. Third, Semantic annotation is a technique which makes it possible to add semantics and a formal structure to unstructured textual documents, an important aspect in semantic information extraction which may be performed by a tool known as KIM(Knowledge Information Management). In this paper, we revisit, explore and discuss some information extraction techniques on web like web usage mining, web scrapping and semantic annotation for a better or efficient information extraction on the web illustrated with examples.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信