Generating links by mining quotations

O. Kolak, Bill N. Schilit
Published in: Proceedings of the nineteenth ACM conference on Hypertext and hypermedia, 2008-06-19
DOI: 10.1145/1379092.1379117 (https://doi.org/10.1145/1379092.1379117)
Citations: 38

Abstract

Scanning books, magazines, and newspapers has become a widespread activity because people believe that much of the world's information still resides off-line. In general, after works are scanned, they are indexed for search and processed to add links. This paper describes a new approach that adds links automatically by mining popularly quoted passages. Our technique connects elements that are semantically rich, so the resulting relations are strong. Moreover, link targets point within a work, which facilitates navigation. This paper makes three contributions. First, we describe a scalable algorithm for mining repeated word sequences from extremely large text corpora. Second, we present techniques that filter and rank the repeated sequences for quotations. Third, we present a new user interface for navigating across and within works in the collection using quotation links. Our system has been run on a digital library of over 1 million books and has been used by thousands of people.
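The abstract does not spell out the mining algorithm, so the following is only a minimal sketch of the general idea of finding repeated word sequences across works, not the authors' method. It assumes a naive in-memory index of fixed-length word n-grams; the function names, the window size `n`, the `min_docs` threshold, and the sample texts are all hypothetical, and a corpus of a million books would need a distributed approach rather than a single dictionary.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def repeated_sequences(docs, n=5, min_docs=2):
    """Index every n-word sequence by where it occurs.

    Returns a dict mapping each sequence (as a string) to a list of
    (doc_id, word_offset) pairs, keeping only sequences that appear in
    at least `min_docs` distinct documents.
    """
    occurrences = defaultdict(list)
    for doc_id, text in docs.items():
        words = tokenize(text)
        for i in range(len(words) - n + 1):
            occurrences[tuple(words[i:i + n])].append((doc_id, i))
    return {
        " ".join(seq): places
        for seq, places in occurrences.items()
        if len({doc_id for doc_id, _ in places}) >= min_docs
    }


# Hypothetical sample texts standing in for scanned works.
docs = {
    "book_a": "He wrote that the only thing we have to fear is fear itself.",
    "book_b": "As Roosevelt said, the only thing we have to fear is fear itself, and so on.",
    "book_c": "An unrelated sentence about scanning newspapers.",
}

links = repeated_sequences(docs, n=5)
for sequence, places in links.items():
    print(sequence, places)
```

Each surviving entry records a word offset inside every work that contains the sequence, which is the property that lets a link target point within a work rather than only at the work as a whole. Filtering and ranking the candidates for genuine quotations, the paper's second contribution, is not attempted here.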