An approach for accessing data from hidden web using intelligent agent technology

Lohit Singh, Dilip Kumar Sharma
2013 3rd IEEE International Advance Computing Conference (IACC), published 2013-05-13
DOI: 10.1109/IADCC.2013.6514329 (https://doi.org/10.1109/IADCC.2013.6514329)
Citations: 2

Abstract

A large amount of information on the Web is hidden from users because traditional search engines cannot access or index it. These search engines can crawl information only by following hypertext links, and they skip forms that require a login or any other authorization process. The hidden web refers to this deepest part of the Web, which is not available to traditional Web crawlers. Obtaining content from the hidden web is a challenging task. Many web sites today contain pages that are dynamic in nature, and this dynamic nature makes it difficult for traditional web crawlers to retrieve information. Earlier efforts to solve this problem are discussed briefly. A comparative study of the previously defined architectures, considering various parameters, is then presented. Based on an analysis of these methods, a framework is proposed that uses intelligent agent technology to access the hidden web.
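The limitation described in the abstract, that a traditional crawler only follows hypertext links and passes over forms, can be illustrated with a minimal sketch. This is not the framework proposed in the paper; the class `LinkAndFormScanner`, the function `scan_page`, and the sample page are hypothetical names and data introduced here for illustration, using only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class LinkAndFormScanner(HTMLParser):
    """Collects hyperlinks and form actions found in one HTML page (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.links = []  # href targets a link-following crawler can visit
        self.forms = []  # form actions whose results require submitted input

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            # A link-only crawler has nothing to follow here: the content
            # behind the form appears only after values are submitted.
            self.forms.append(attrs.get("action", ""))


def scan_page(html):
    """Return (links, forms) found in the given HTML string."""
    scanner = LinkAndFormScanner()
    scanner.feed(html)
    return scanner.links, scanner.forms


# Hypothetical sample page: two ordinary links and one search form.
SAMPLE_PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <a href="/catalog/page1.html">Catalog</a>
  <form action="/search" method="post">
    <input type="text" name="query">
    <input type="submit" value="Search">
  </form>
</body></html>
"""

links, forms = scan_page(SAMPLE_PAGE)
print("reachable by following links:", links)
print("behind forms (hidden web entry points):", forms)
```

In this sketch the two `<a>` targets are what a link-following crawler would enqueue, while the `/search` form marks an entry point to content that stays unindexed unless something fills in and submits the form.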