Practical Web Data Extraction: Are We There Yet? - A Short Survey

Andreas Schulz, Jörg Lässig, M. Gaedke
2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), pp. 562-567
Published: 2016-10-01
DOI: 10.1109/WI.2016.0096 (https://doi.org/10.1109/WI.2016.0096)
Citations: 13

Abstract

The number of web documents, as well as the data and information they contain, is growing at a rapid pace. Interest in extracting and utilizing this data is rising likewise. The prospects that Web Data Extraction unlocks for its users are as broad as the range of topics and fields on the Web. The major obstacle is utilizing the available data, contents and processes. Several, mostly older, survey papers have already shown developments and approaches to solving Web Data Extraction tasks, but there is a need for a more up-to-date review showing the latest developments. Additionally, when looking from the user perspective, there is still a gap between research results and practical applicability. Available solutions, including research results, commercial products and open source solutions, lack certain capabilities or suffer from severe usability issues. This paper therefore gives a short review of the state of the art in Web Data Extraction and relates this to the practical application of these technologies.