Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries: latest articles

The Archival Acid Test: Evaluating archive performance on advanced HTML and JavaScript
Mat Kelly, Michael L. Nelson, Michele C. Weigle
{"title":"The Archival Acid Test: Evaluating archive performance on advanced HTML and JavaScript","authors":"Mat Kelly, Michael L. Nelson, Michele C. Weigle","doi":"10.1109/JCDL.2014.6970146","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970146","url":null,"abstract":"When preserving web pages, archival crawlers sometimes produce a result that varies from what an end-user expects. To quantitatively evaluate the degree to which an archival crawler is capable of comprehensively reproducing a web page from the live web into the archives, the crawlers' capabilities must be evaluated. In this paper, we propose a set of metrics to evaluate the capability of archival crawlers and other preservation tools using the Acid Test concept. For a variety of web preservation tools, we examine previous captures within web archives and note the features that produce incomplete or unexpected results. From there, we design the test to produce a quantitative measure of how well each tool performs its task.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"95 1","pages":"25-28"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90523481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
PageRank-based Word Sense Induction within Web Search Results Clustering
Jose G. Moreno, G. Dias
{"title":"PageRank-based Word Sense Induction within Web Search Results Clustering","authors":"Jose G. Moreno, G. Dias","doi":"10.1109/JCDL.2014.6970227","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970227","url":null,"abstract":"Word Sense Induction is an open problem in Natural Language Processing. Many recent works have been addressing this problem with a wide spectrum of strategies based on content analysis. In this paper, we present a sense induction strategy exclusively based on link analysis over the Web. In particular, we explore the idea that the main different senses of a given word share similar linking properties and can be found by performing clustering with link-based similarity metrics. The evaluation results show that PageRank-based sense induction achieves interesting results when compared to state-of-the-art content-based algorithms in the context of Web Search Results Clustering.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"145 1","pages":"465-466"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89086956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Crowd-sourcing Web knowledge for metadata extraction
Zhaohui Wu, W. Huang, Chen Liang, C. Lee Giles
{"title":"Crowd-sourcing Web knowledge for metadata extraction","authors":"Zhaohui Wu, W. Huang, Chen Liang, C. Lee Giles","doi":"10.1109/JCDL.2014.6970160","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970160","url":null,"abstract":"We explore a new metadata extraction framework without human annotators with the ground truth harvested from Web. A new training sample is selected based on not only the uncertainty and representativeness in the unlabeled pool, but also on its availability and credibility in Web knowledge bases. We construct a dataset of 4329 books with valid metadata and evaluate our approach using 5 Web book databases as oracles. Empirical results demonstrate its effectiveness and efficiency.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"135 1","pages":"141-144"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86424825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
The value of risk management for data management in science and engineering
Filipe Ferreira, Ricardo Vieira, J. Borbinha
{"title":"The value of risk management for data management in science and engineering","authors":"Filipe Ferreira, Ricardo Vieira, J. Borbinha","doi":"10.1109/JCDL.2014.6970214","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970214","url":null,"abstract":"An established concept to address data management challenges in science and engineering is the Data Management Plans. However, we claim that in some complex scenarios the actual principles for Data Management Plans might not be enough, especially when Risk Management turns to be relevant. Therefore, we propose a method, based on the ISO 31000, for science and engineering projects to create a Risk Management Plan that can complement the Data Management Plan. The validation of this proposal is presented in the real case of an engineering laboratory.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"29 6 1","pages":"439-440"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81444292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Human and machine error analysis on dependency parsing of ancient Greek texts
Saeed Majidi, G. Crane
{"title":"Human and machine error analysis on dependency parsing of ancient Greek texts","authors":"Saeed Majidi, G. Crane","doi":"10.1109/JCDL.2014.6970171","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970171","url":null,"abstract":"Automatically generated metadata from large collections is an essential component of digital libraries. It is beginning to emerge as fundamental to the study of languages. Morphosyntactic annotation captures the form of individual words and their function. Nonetheless automated syntactic analysis is still imperfect and human annotators can be significantly more accurate. On the other hand, human work is expensive and even humans find some constructions difficult to annotate correctly. Comparing the performance of human annotators with that of an automatic parser is thus important for exploring how the two methods can best be combined. In the present study, we compare the frequency of the different types of errors made by student annotators with those made by different dependency parsers when annotating ancient Greek. With a few exceptions, the frequency of the different types of errors was similar for human and machine. The significance of these results is briefly discussed.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"4 1","pages":"221-224"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81664669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Research networks in data repositories
Mark R. Costa, Jian Qin, Jun Wang
{"title":"Research networks in data repositories","authors":"Mark R. Costa, Jian Qin, Jun Wang","doi":"10.1109/JCDL.2014.6970197","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970197","url":null,"abstract":"This paper reports our ongoing work investigating the structural features of scientific collaboration based on metadata collected from a scientific data repository (SDR). The background literature is reviewed in supporting our claim that metadata collected from SDRs offer a complementary data source to traditional publication metadata collected from digital libraries. Methodological considerations are discussed in association with using metadata from SDRs, including author name disambiguation and data parsing. Initial findings show that the network has some unique macro-level structural features while also in agreement with existing networks theories. Challenges due to inconsistent metadata quality control procedures are also discussed in an attempt to reinforce claims that metadata should be designed to support both domain specific retrieval and evaluation and assessment needs.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"83 1","pages":"403-406"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84428949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Mink: Integrating the live and archived web viewing experience using web browsers and memento
Mat Kelly, Michael L. Nelson, Michele C. Weigle
{"title":"Mink: Integrating the live and archived web viewing experience using web browsers and memento","authors":"Mat Kelly, Michael L. Nelson, Michele C. Weigle","doi":"10.1109/JCDL.2014.6970229","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970229","url":null,"abstract":"We describe Mink, a new web browser extension that provides a different model for integration of the live and archived web. While a user browses the live web, Mink actively queries the archives and reports other instances of the page in the archives without requiring active querying by the user. Further, by querying the archives dynamically and asynchronously, a user can view the extent to which the currently viewed page on the live web has been archived and proactively submit a request to various archives using an overlay on the live web page and a simple interface.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"42 1","pages":"469-470"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86737864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Implementing Digital Preservation Strategy: Developing content collection profiles at the British Library
M. Day, A. MacDonald, M. Pennock, Akiko Kimura
{"title":"Implementing Digital Preservation Strategy: Developing content collection profiles at the British Library","authors":"M. Day, A. MacDonald, M. Pennock, Akiko Kimura","doi":"10.1109/JCDL.2014.6970145","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970145","url":null,"abstract":"The British Library is increasingly a digital library. Through both digitization and acquisition, it has built up significant collections of digital content covering a very wide range of content types. Most recently, the extension of legal deposit provisions to non-print works in 2013 has meant that it - working in conjunction with the other UK legal deposit libraries - has begun to collect new categories of digital content, including periodic harvests of the UK Web domain. In order to support this, the Library has also invested heavily in developing scalable infrastructures for the acquisition, storage and management of large amounts of digital content. The British Library Digital Preservation Strategy, 2013-2016 is focused on the embedding of digital sustainability as an organizational principle across the Library and to help manage preservation risks and challenges across all digital collection content lifecycles. This practice paper describes work being undertaken by the Digital Preservation Team at the British Library to develop content profiles of high-level digital collections that will support the implementation of the strategy, in particular for the capture of long-term preservation requirements.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"8 1","pages":"21-24"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85788003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
The anatomy of a search and mining system for digital humanities
Martyn Harris, M. Levene, Dell Zhang, D. Levene
{"title":"The anatomy of a search and mining system for digital humanities","authors":"Martyn Harris, M. Levene, Dell Zhang, D. Levene","doi":"10.1109/JCDL.2014.6970163","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970163","url":null,"abstract":"Samtla (Search And Mining Tools with Linguistic Analysis) is an online integrated research environment designed in collaboration with historians and linguists to facilitate the study of digitised texts written in any language. It currently supports the research of two corpora: the Genizah collection held by the Taylor-Schechter Genizah Research Unit in Cambridge University, and a collection of Aramaic incantation texts from late antiquity. In contrast to standard search engines and text mining systems that rely on the bag-of-words representation of text, Samtla provides the retrieval and discovery of fuzzy text patterns/motifs (aka “formulae” to historians), which is achieved through applying a character-based n-gram statistical language model built on top of a powerful generalised suffix tree data structure. This paper briefly describes the major components of Samtla and their underlying techniques.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"11 1","pages":"165-168"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80280288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Explore the stacks: A system for exploration in large digital libraries
M. Hall
{"title":"Explore the stacks: A system for exploration in large digital libraries","authors":"M. Hall","doi":"10.5555/2740769.2740845","DOIUrl":"https://doi.org/10.5555/2740769.2740845","url":null,"abstract":"Providing access to large digital library collections to novice users requires novel interfaces that are not built around the concept of search, as novice users frequently struggle to formulate appropriate queries. This paper presents the “Explore the Stacks” system, which provides a novel, browsing-focused interface for exploring digital library collections that is applicable to Big Data scale digital libraries. The system is demonstrated using a collection of approximately one million book illustrations provided by the British Library.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"1 1","pages":"417-418"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79898587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2