Just-in-time recovery of missing web pages

Terry L. Harrison, Michael L. Nelson
UK Conference on Hypertext, 2006-08-22
DOI: 10.1145/1149941.1149971 (https://doi.org/10.1145/1149941.1149971)
Citations: 41

Abstract

We present Opal, a lightweight framework for interactively locating missing web pages (HTTP status code 404). Opal is an example of "in vivo" preservation: harnessing the collective behavior of web archives, commercial search engines, and research projects for the purpose of preservation. Opal servers learn from their experiences and share their knowledge with other Opal servers by mutual harvesting using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Using cached copies that can be found on the web, Opal creates lexical signatures, which are then used to search for similar versions of the web page. We present the architecture of the Opal framework, discuss a reference implementation, and present a quantitative analysis indicating that Opal could be effectively deployed.
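
To make the idea of a lexical signature concrete, here is a minimal sketch, not the Opal implementation. The abstract does not say how signatures are computed, so this assumes a simple TF-IDF ranking against a small background corpus; the names `tokenize`, `lexical_signature`, `background_docs`, and `k` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(page_text, background_docs, k=5):
    """Return the k terms of page_text with the highest TF-IDF score.

    Assumption: document frequency is estimated from background_docs
    plus the page itself. A real system might estimate it differently.
    """
    tf = Counter(tokenize(page_text))
    doc_sets = [set(tokenize(d)) for d in background_docs]
    n_docs = len(doc_sets) + 1  # background documents plus the page itself
    scores = {}
    for term, count in tf.items():
        df = 1 + sum(term in s for s in doc_sets)  # the page always contains the term
        scores[term] = count * math.log(n_docs / df)
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


if __name__ == "__main__":
    cached_copy = "opal opal opal recovers missing pages"
    background = ["missing pages are common", "the pages of a book"]
    signature = lexical_signature(cached_copy, background, k=2)
    # The signature terms could then be joined into a search-engine query.
    print(" ".join(signature))
```

Under this sketch, terms that are frequent in the cached copy but rare elsewhere rank highest, which is what would make them useful as a query for finding similar versions of a missing page.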