{"title":"A quantitative comparison on file folder structures of two groups of information workers","authors":"Hong Zhang, Xiao Hu","doi":"10.1109/JCDL.2014.6970237","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970237","url":null,"abstract":"This study compares file folder structures on personal computers of two groups of information workers, administrative staff and PhD students. A set of quantitative measures is calculated, disclosing the differences and similarities between folder structures of the two user groups. The results show that the group conducting more administrative activities has broader and shallower folders than the PhD group, which performs more research activities, and the folders of the PhD group are more populated over deeper levels of the trees than those of the administrative group. The study improves our understanding of the various quantitative measures in investigating personal computer folder structures, and furthermore contributes to our knowledge of the information organization structure in personal information systems.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"83 1","pages":"485-486"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88480767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zeri e LODE. Extracting the Zeri photo archive to linked open data: formalizing the conceptual model","authors":"Ciro Mattia Gonano, Francesca Tomasi, Francesca Mambelli, F. Vitali, S. Peroni","doi":"10.1109/JCDL.2014.6970182","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970182","url":null,"abstract":"This paper presents the first steps of a project to convert the notable Italian “Zeri photo archive” to a linked and open dataset. The full project entails the analysis of the records' description model (Scheda F) in order to define a suitable ontology by exploring existing data models, the creation of the RDF triple store, the creation of links to the cloud, and the definition of the user interface for browsing the linked open dataset. This paper presents and discusses the conceptual modeling of the data stored in the Zeri archival database.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"26 1","pages":"289-298"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74096932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-cultural mood regression for music digital libraries","authors":"Xiao Hu, Yi-Hsuan Yang","doi":"10.1109/JCDL.2014.6970230","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970230","url":null,"abstract":"Mood is a popular access point in music digital libraries and online music repositories, and is often represented as numerical values in a small number of emotion-related dimensions (e.g., valence and arousal). As music mood is recognized as culturally dependent, this study investigates whether regression models built with music data in one culture can be applied to music in another culture. Results indicate that cross-cultural predictions of both valence and arousal values are feasible.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"189 1","pages":"471-472"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76532535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"REEL: A Relation Extraction Learning framework","authors":"Pablo Barrio, Gonçalo Simões, H. Galhardas, L. Gravano","doi":"10.1109/JCDL.2014.6970222","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970222","url":null,"abstract":"We introduce the REEL (RElation Extraction Learning) framework, an open source framework that facilitates the development and evaluation of relation extraction systems over text collections. To define a relation extraction system for a new relation and text collection, users only need to specify the parsers to load the collection, the relation and its constraints, and the learning and extraction techniques to be used. This makes REEL a powerful framework to enable the deployment and evaluation of relation extraction systems for both application building and research.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"7 1","pages":"455-456"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78646034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting and modeling local text reuse","authors":"David A. Smith, Ryan Cordell, E. M. Dillon, Nicholas Stramp, J. Wilkerson","doi":"10.5555/2740769.2740800","DOIUrl":"https://doi.org/10.5555/2740769.2740800","url":null,"abstract":"Texts propagate through many social networks and provide evidence for their structure. We describe and evaluate efficient algorithms for detecting clusters of reused passages embedded within longer documents in large collections. We apply these techniques to two case studies: analyzing the culture of free reprinting in the nineteenth-century United States and the development of bills into legislation in the U.S. Congress. Using these divergent case studies, we evaluate both the efficiency of the approximate local text reuse detection methods and the accuracy of the results. These techniques allow us to explore how ideas spread, which ideas spread, and which subgroups shared ideas.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"19 1","pages":"183-192"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78694165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards building a scholarly big data platform: Challenges, lessons and opportunities","authors":"Zhaohui Wu, Jian Wu, Madian Khabsa, Kyle Williams, Hung-Hsuan Chen, W. Huang, Suppawong Tuarob, Sagnik Ray Choudhury, Alexander Ororbia, P. Mitra, C. Lee Giles","doi":"10.1109/JCDL.2014.6970157","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970157","url":null,"abstract":"We introduce a Big Data platform that provides various services for harvesting scholarly information and enabling efficient scholarly applications. The core architecture of the platform is built on a secured private cloud, crawls data using a scholarly focused crawler that leverages a dynamic scheduler, processes by utilizing a map reduce based crawl-extraction-ingestion (CEI) workflow, and is stored in distributed repositories and databases. Services such as scholarly data harvesting, information extraction, and user information and log data analytics are integrated into the platform and provided by an OAI and RESTful API. We also introduce a set of scholarly applications built on top of this platform including citation recommendation and collaborator discovery.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"117 1","pages":"117-126"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76742877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The SCAPE preservation lifecycle","authors":"Kresimir Duretec, Artur Kulmukhametov, M. Kraxner, Markus Plangg, Christoph Becker, Luis Faria","doi":"10.1109/JCDL.2014.6970207","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970207","url":null,"abstract":"Continuous activities such as preservation monitoring, planning and operations, including the provisioning of access mechanisms or the creation of derivatives through migration, are needed to enable continuous access to content across evolving technological contexts without affecting the authenticity of digital objects. This article describes the SCAPE preservation suite, a loosely coupled set of systems and open APIs that facilitate scalable content profiling, monitoring, planning and workflow execution.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"2 1","pages":"425-426"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76903618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Keeping your aggregative infrastructure under control","authors":"M. Artini, Claudio Atzori, P. Manghi","doi":"10.1109/JCDL.2014.6970199","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970199","url":null,"abstract":"“Aggregative Data Infrastructures” (ADIs) are systems devised to collect metadata descriptions (and files) from several data sources to construct uniform Information Spaces, hence providing cross-data source access via standard APIs or custom portals. ADIs typically deal with data collection workflows from arbitrary numbers of data sources, with heterogeneous access protocols, data exchange formats, and data models. Besides, they handle data processing workflows for the harmonization and enrichment of aggregated metadata. Correct workflow management is crucial to ensure Information Space consistency, but is in general hard to sustain. This demo will present the solution offered in the context of the OpenAIRE infrastructure, which today collects metadata and files from more than 450 data sources (and growing) of several typologies. The D-NET Workflow Management Suite user interfaces support data curators in orchestrating, over time and in a sustainable way, the configuration, execution, and monitoring of data collection and processing workflows for thousands of data sources.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"8 1","pages":"409-410"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75127532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining domain-specific heuristics for author name disambiguation","authors":"Alan Filipe Santana, Marcos André Gonçalves, Alberto H. F. Laender, Anderson A. Ferreira","doi":"10.1109/JCDL.2014.6970165","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970165","url":null,"abstract":"Author name disambiguation has been one of the hardest problems faced by digital libraries since their early days. Historically, supervised solutions have empirically outperformed those based on heuristics, but with the burden of having to rely on manually labelled training sets for the learning process. Moreover, most supervised solutions just apply some type of generic machine learning solution and do not exploit specific knowledge about the problem. In this paper, we follow a similar reasoning, but in the opposite direction. Instead of extending an existing supervised solution, we propose a set of carefully designed heuristics and similarity functions and apply supervision only to optimize such parameters for each particular dataset. As our experiments show, the result is a very effective, efficient and practical author name disambiguation method that can be used in many different scenarios.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"23 1","pages":"173-182"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78176457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Big Brother is Watching You—But in a Good Way","authors":"Carlin St. Pierre, D. Bainbridge, Bill Rogers","doi":"10.1109/JCDL.2014.6970179","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970179","url":null,"abstract":"In any modern desktop environment the glyph compositor (where raw text information is combined with font information and other attributes to render rasterized component images) is part of the software's core functionality. In this paper we present work that shows it is computationally feasible to apply full-text indexing in real-time to the live stream of glyph compositor operations generated by a user's interaction with their desktop environment. By embedding indexing functionality at such a level, we effectively get to “see” (and more importantly remember) all the text that is drawn on the user's screen. With elements reminiscent of the Memex, we illustrate the technique in use through a personal digital library we have developed that enriches (through text-searching and context) the user's desktop experience by letting them go back in time to view information that had previously been displayed. We achieved this by augmenting our dynamically updated text index with time-stamped snapshots of the desktop. By recording the (x, y) positions of the text at the time it is rendered, the snapshots have a semi-live feel, whereby text can be selected for copy-and-paste operations for further use. Moreover, windows, even if they were hidden behind others at the time the text was rendered, can be brought to the front and their text accessed.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"20 1","pages":"277-280"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79636621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}