{"title":"Identifying the same records across multiple Ukiyo-e image databases using textual data in different languages","authors":"Biligsaikhan Batjargal, T. Kuyama, Fuminori Kimura, Akira Maeda","doi":"10.1109/JCDL.2014.6970167","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970167","url":null,"abstract":"This paper proposes a novel method for identifying the same records across multiple databases in different languages. In order to identify the same records, we calculate the similarities between records by comparing the text values of metadata elements. The proposed method, i.e. finding the same records across multiple databases, will help users to know which organization has a certain record and its customized versions regardless of languages and differences in formats. Although the proposed approach was demonstrated on Japanese Ukiyo-e databases, it might be applicable to other disciplines for bridging the gaps between databases in different languages.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"1 1","pages":"193-196"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87792317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TagTick: A tool for annotation tagging over solr indexes","authors":"M. Artini, Claudio Atzori, A. Bardi, Sandro La Bruzzo, P. Manghi","doi":"10.1109/JCDL.2014.6970198","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970198","url":null,"abstract":"“Annotation tagging” is an important curation action performed by authorized data curators willing to classify according to a common vocabulary an Information Space of potentially heterogeneous objects (e.g. not sharing common classification schemes). To carry out their activities, data curators need annotation tagging tools which allow them to bulk tag or untag large sets of objects in temporary work sessions, where they can experiment in real-time the effect of their actions before making the changes visible to end-users. Real-time temporary bulk tagging is a non trivial feature to implement, which strictly depends on the back-end used to index the Information Space. This demo presents TagTick, a tool which offers to data curators a fully functional annotation tagging environment over full-text index Apache Solr, considered a “de facto standard” in the field.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"43 1","pages":"407-408"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84541093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"When catalogs collide: A mashup of the bibliographic records from New Zealand's National Bibliography and the HathiTrust","authors":"Steffan Safey, D. Bainbridge","doi":"10.1109/JCDL.2014.6970205","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970205","url":null,"abstract":"In this article we present work done developing an interactive comparison tool for large-scale catalogs using the general purpose open source digital library toolkit, Greenstone. The two catalogs selected to demonstrate the approach were the Bibliographic Records from New Zealand's National Bibliography and the HathiTrust. With Greenstone's triple-store extension activated, the two collections were ingested to form two Greenstone collections. Next, an interactive visualization tool was developed within the digital library's presentation layer to allow users to explore the two collections, comparing fields from the two collections and producing a variety of visualizations. The required interactivity was accomplished using AJAX calls to the Greenstone triple-store, further supported by the use of Javascript libraries for the presentation of the retrieved data in both visual and spreadsheet forms.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"57 1","pages":"421-422"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84599568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Microscopic analysis of document handling while reading: Classification of behavior toward paper document","authors":"Kentaro Takano, Hirohito Shibata, Junko Ichino, T. Hashiyama, S. Tano","doi":"10.1109/JCDL.2014.6970217","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970217","url":null,"abstract":"We conducted a microscopic analysis of work-related reading to find ways to support reading in the workplace. We obtained empirical data from video recording, concurrent verbal reporting, and retrospective reporting of 18 participants in 10 target types of reading using paper. Using these data, we categorized the ways people interact with paper while reading in detail. We will discuss what kinds of support are required for work-related reading.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"39 1","pages":"445-446"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86588941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilizing digital humanities methods for quantifying Howell's State Trials","authors":"Tracy Bergstrom, Donald Brower, N. Meyers","doi":"10.1109/JCDL.2014.6970215","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970215","url":null,"abstract":"In this paper we describe the undertaking of a quantitative, historically oriented analysis of the law of England between 1650-1700 as represented in Howell's State Trials. Our goal was to analyze cases over time to support investigation into whether a quantitative analysis of the content of the 1650-1700 State Trials would exhibit an upward trend of religious tolerance.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"22 1","pages":"441-442"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89089255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building a dataset of sensitive information","authors":"Claire Llewellyn, Laine Ruus, Ros Burnett, Steve Kirkwood, Mark Smith, Rocio von Jungenfeld","doi":"10.1109/JCDL.2014.6970241","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970241","url":null,"abstract":"Using text analysis tools to study large data sets is currently an area of popular interest. Prompted by the success of several big data research initiatives, researchers from a variety of disciplines wish to gather and analyse textual data. Communication between members of diverse teams can present a problem and developing a shared language and understanding of the task is necessary.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"16 1","pages":"493-494"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79713128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lend me some sugar: Borrowing rates of neighbouring books as evidence for browsing","authors":"Dana Mckay, Wally Smith, Shanton Chang","doi":"10.1109/JCDL.2014.6970161","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970161","url":null,"abstract":"There is more to choosing a book than simply keyword searching. Browsing is a fundamental part of the information seeking process, and one that information seekers profess to value, though it has attracted little study. This dearth of research is undoubtedly in part because browsing is nebulous and difficult to quantify. In this paper we use a large circulation dataset from an academic library consortium to examine whether books in the library stacks are loaned in clusters, with a view firstly to confirming the existence of book browsing that has been reported anecdotally, and secondly to quantifying its impact on loan patterns.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"3 1","pages":"145-154"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72962168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The ups and downs of knowledge infrastructures in science: Implications for data management","authors":"C. Borgman, P. Darch, A. Sands, J. Wallis, Sharon Traweek","doi":"10.1109/JCDL.2014.6970177","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970177","url":null,"abstract":"The promise of technology-enabled, data-intensive scholarship is predicated upon access to knowledge infrastructures that are not yet in place. Scientific data management requires expertise in the scientific domain and in organizing and retrieving complex research objects. The Knowledge Infrastructures project compares data management activities of four large, distributed, multidisciplinary scientific endeavors as they ramp their activities up or down; two are big science and two are small science. Research questions address digital library solutions, knowledge infrastructure concerns, issues specific to individual domains, and common problems across domains. Findings are based on interviews (n=113 to date), ethnography, and other analyses of these four cases, studied since 2002. Based on initial comparisons, we conclude that the roles of digital libraries in scientific data management often depend upon the scale of data, the scientific goals, and the temporal scale of the research projects being supported. Digital libraries serve immediate data management purposes in some projects and long-term stewardship in others. In small science projects, data management tools are selected, designed, and used by the same individuals. In the multi-decade time scale of some big science research, data management technologies, policies, and practices are designed for anticipated future uses and users. The need for library, archival, and digital library expertise is apparent throughout all four of these cases. Managing research data is a knowledge infrastructure problem beyond the scope of individual researchers or projects. The real challenges lie in designing digital libraries to assist in the capture, management, interpretation, use, reuse, and stewardship of research data.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"33 1","pages":"257-266"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73589949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Articles, papers, chapters, theses - who wins the visibility wars?","authors":"M. Weideman","doi":"10.1109/JCDL.2014.6970234","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970234","url":null,"abstract":"Researchers need access to previous research to base their own work on. Some of the most commonly referenced materials are published in the form of journal articles, conference papers, books and book chapters, and research theses. The purpose of this research was to determine how these four categories of documents compare in terms of visibility to search engine crawlers. A questionnaire was used to gather data from international scholars on their completed research. Three types of queries were generated and over 3000 Web sites were inspected to determine the visibility of these outputs. Search engine result pages were inspected, and the rankings of the research documents were recorded and converted to a scoring system. The results have indicated that the four types of outputs enjoy varying degrees of exposure to search engines, with journal articles leading the way, and books/book chapters having the smallest degree of exposure to search engines. Some query types also produced better results than others. It was concluded that journal articles provide the best way to expose research work to Internet searchers through search engines.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"69 1","pages":"479-480"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74069794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Representing topics labels for exploring digital libraries","authors":"Nikolaos Aletras, Timothy Baldwin, Jey Han Lau, Mark Stevenson","doi":"10.1109/JCDL.2014.6970174","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970174","url":null,"abstract":"Topic models have been shown to be a useful way of representing the content of large document collections, for example via visualisation interfaces (topic browsers). These systems enable users to explore collections by way of latent topics. A standard way to represent a topic is using a set of keywords, i.e. the top-n words with highest marginal probability within the topic. However, alternative topic representations have been proposed, including textual and image labels. In this paper, we compare different topic representations, i.e. sets of topic words, textual phrases and images, in a document retrieval task. We asked participants to retrieve relevant documents based on pre-defined queries within a fixed time limit, presenting topics in one of the following modalities: (1) sets of keywords, (2) textual labels, and (3) image labels. Our results show that textual labels are easier for users to interpret than keywords and image labels. Moreover, the precision of retrieved documents for textual and image labels is comparable to the precision achieved by representing topics using sets of keywords, demonstrating that labelling methods are an effective alternative topic representation.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"1 1","pages":"239-248"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81653671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}