{"title":"CED2AR: The Comprehensive Extensible Data Documentation and Access Repository","authors":"C. Lagoze, L. Vilhuber, Jeremy Williams, B. Perry, William C. Block","doi":"10.1109/JCDL.2014.6970178","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970178","url":null,"abstract":"We describe the design, implementation, and deployment of the Comprehensive Extensible Data Documentation and Access Repository (CED2AR). This is a metadata repository system that allows researchers to search, browse, access, and cite confidential data and metadata either through a web-based user interface or programmatically through a search API, all the while reusing and linking to existing archive- and provider-generated metadata. CED2AR is distinguished from other metadata repository-based applications due to requirements that derive from its social science context. These include the need to cloak confidential data and metadata and manage complex provenance chains.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"140 1","pages":"267-276"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82264231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Image-based Chinese Calligraphic Character Retrieval on Large Scale Data","authors":"Gao Pengcheng, Wu Jiangqin, Lin Yuan, Xia Yang, Mao Tianjiao, Wei Baogang","doi":"10.1109/JCDL.2014.6970170","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970170","url":null,"abstract":"Chinese calligraphy, the art of handwriting, draws a lot of attention for its beauty and elegance. In CADAL, a Calligraphic Character Dictionary (CCD), which contains hundreds of thousands of character images labeled with semantic meaning, has been constructed and provided online to common users. It is a great challenge to perform quick and accurate image-based calligraphic character retrieval on CCD. In this paper, a novel shape descriptor, Oriented Shape Context (OSC), is proposed based on the traditional Shape Context (SC) to perform similarity searching. Together with GIST, the GIST-OSC descriptor is proposed to represent calligraphic character images for efficient and effective retrieval. In addition, an effective retrieval schema is proposed. The retrieval schema works in two steps. First, approximate nearest neighbors of the query image are found quickly using GIST; then one-to-one fine matching between the approximate nearest neighbors and the query image is performed using OSC. Our experiments show that the GIST-OSC descriptor and the retrieval schema are efficient and effective for Chinese calligraphic character retrieval on large scale data.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"12 1","pages":"211-220"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90786257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Making research data findable in digital libraries: A layered model for user-oriented indexing of survey data","authors":"Tanja Friedrich, A. Kempf","doi":"10.1109/JCDL.2014.6970150","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970150","url":null,"abstract":"The growing amount of data in research and the aspiration toward a culture of data sharing make it necessary to improve data documentation in digital libraries. On these grounds we present a conceptual model for subject indexing of research data. Taking the example of social science survey data, we examine the applicability of established indexing principles. Based on these principles, our research incorporates the special characteristics of social science survey data, leading us to a model of layered subject indexing.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"73 1","pages":"53-56"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86402499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quality assessment of collaborative content with minimal information","authors":"D. H. Dalip, Harlley Lima, Marcos André Gonçalves, Marco Cristo, P. Calado","doi":"10.1109/JCDL.2014.6970169","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970169","url":null,"abstract":"Content generated by users is one of the most interesting phenomena of published media. However, the possibility of unrestricted editing is a source of doubts about its quality. This issue has motivated many studies on how to automatically assess content quality in collaborative web sites. Generally, these studies use machine learning techniques to combine a large number of quality indicators into a single value representing the overall quality of the document. This need for a high number of indicators, however, has detrimental implications both on the efficiency and on the effectiveness of the quality assessment algorithms. In this work, we exploit and extend a feature selection method based on the SPEA2 multi-objective genetic algorithm. Results show that we can reduce the feature set to between 15% and 25% of its original size, while obtaining error rates comparable to the state of the art.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"113 1","pages":"201-210"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79400344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recommendation based on Deduced Social Networks in an educational digital library","authors":"Monika Akbar, C. Shaffer, Weiguo Fan, E. Fox","doi":"10.1109/JCDL.2014.6970147","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970147","url":null,"abstract":"Discovering useful resources can be difficult in digital libraries with large content collections. Many educational digital libraries (edu-DLs) host thousands of resources. One approach to avoiding information overload involves modeling user behavior. But this often depends on user feedback, along with the demographic information found in user account profiles, in order to model and predict user interests. However, edu-DLs often host collections with open public access, allowing users to navigate through the system without needing to provide identification. With few identifiable users, building models linked to user accounts provides insufficient data to recommend useful resources. Analyzing user activity on a per-session basis, to deduce a latent user network, can take place even without user profiles or prior use history. The resulting Deduced Social Network (DSN) can be used to improve DL services. An example of a DSN is a graph whose nodes are sessions and whose edges connect two sessions that view the same resource. In this paper we present a recommendation framework for edu-DLs that depends on deduced connections between users. Results show that a recommendation system built from DSN-dependent parameters can improve performance compared to when only text similarity between resources is used. Our approach can potentially improve recommendation for DL resources when implicit user activities (e.g., view, click, search) are abundant but explicit user activities (e.g., account creation, rating, comment) are unavailable.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"132 1","pages":"29-38"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77409770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Creating lightweight ontologies for dataset description practical applications in a cross-domain research data management workflow","authors":"João Aguiar Castro, J. Silva, Cristina Ribeiro","doi":"10.1109/JCDL.2014.6970185","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970185","url":null,"abstract":"The description of data is a central task in research data management. Describing datasets requires deep knowledge of both the data and the data creation process to ensure adequate capture of their meaning and context. Metadata schemas are usually followed in resource description to enforce comprehensiveness and interoperability, but they can be hard for researchers to understand and adopt. We propose to address data description using ontologies, which can evolve easily, express semantics at different granularity levels and be directly used in system development. Considering that existing ontologies are often hard to use in a cross-domain research data management environment, we present an approach for creating lightweight ontologies to describe research data. We illustrate our process with two ontologies, and then use them as configuration parameters for Dendro, a software platform for research data management currently being developed at the University of Porto.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"93 1","pages":"313-316"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78079004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualized Related Topics (VRT) system for health information retrieval","authors":"Sukjin You, Joel DesArmo, Xiangming Mu, Sukwon Lee, Jessica C. Neal","doi":"10.1109/JCDL.2014.6970209","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970209","url":null,"abstract":"To help bridge the gap between consumer users' vocabulary and the controlled vocabulary used to index health information, in this demo we implemented a Visualized Related Topics (VRT) browser system. The VRT was integrated into the “MeshMed” [2] system to support health information retrieval. The key technology behind the VRT browser is to select MeSH terms, which represent the related topics or subjects, from the top relevant documents. We rank these MeSH terms using the traditional Term Frequency-Inverse Document Frequency (TF-IDF) algorithm. The VRT browser displays a graphic representation of these MeSH terms by creating a visualization in which the selected MeSH terms stem from the centered user query. The design goal is to provide users with an overview of the key topics of the search results. In addition, the VRT browser may also help users form better queries. Using the VRT browser, we will be studying how to effectively assist consumer users with their health information seeking.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"468 1","pages":"429-430"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80138336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PerCon: A personal digital library for heterogeneous data","authors":"Su Inn Park, F. Shipman","doi":"10.1109/JCDL.2014.6970155","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970155","url":null,"abstract":"Systems are needed to support access to and analysis of large heterogeneous scientific datasets. We developed PerCon, a data management and analysis environment, to support such activities. PerCon processes and integrates data gathered via queries to existing data providers to create a personal digital library of data. Users may then search, browse, visualize and annotate the data as they proceed with analysis and interpretation. Interpretation in PerCon takes place in a visual workspace in which multiple data visualizations and annotations are placed into spatial arrangements based on the current task. The system watches for patterns in the user's data selection and organization and through mixed-initiative interaction assists users by suggesting potentially relevant data from unexplored data sources. PerCon's data location and analysis capabilities were evaluated in a controlled study with 24 users. Study participants had to locate and analyze heterogeneous weather and river data with and without the visual workspace and mixed-initiative interaction, respectively. Results indicate that the visual workspace facilitated information representation and aided in the identification of relationships between datasets. The system's suggestions encouraged data exploration, leading participants to identify more evidence of correlation among data streams and more potential interactions among weather and river data.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"5 1","pages":"97-106"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89253951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extraction of evolution descriptions from the web","authors":"Helge Holzmann, T. Risse","doi":"10.1109/JCDL.2014.6970201","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970201","url":null,"abstract":"The evolution of named entities affects exploration and retrieval tasks in digital libraries. An information retrieval system that is aware of name changes can actively support users in finding former occurrences of evolved entities. However, current structured knowledge bases, such as DBpedia or Freebase, do not provide enough information about evolutions, even though the data is available in their sources, such as Wikipedia. Our Evolution Base prototype will demonstrate how excerpts describing name evolutions can be identified on these Web sites with promising precision. The descriptions are classified by means of models that we trained based on a recent analysis of named entity evolutions on Wikipedia.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"1 1","pages":"413-414"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88079256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mood metadata for video games and interactive media","authors":"Stephanie Rossi, Jin Ha Lee, R. Clarke","doi":"10.1109/JCDL.2014.6970232","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970232","url":null,"abstract":"Video games are becoming an important part of digital library collections due to increasing popularity and the acknowledgement of their significance as cultural artifacts. In order to support robust search and browse functions, it is imperative to develop a metadata schema to effectively represent this medium. The potential of mood metadata in the domain of video game classification is little explored, despite the value given to it by gamers in user studies. Here, we present a Controlled Vocabulary (CV) for moods related to video games with 17 defined mood terms, equivalent terms, and game examples. This CV will enable catalogers to organize video games by mood, allowing mood to be used for search and collocation. In order to evaluate the applicability of this CV and discover which terms are most relevant for video games, we annotated the mood of a sample collection of 617 video game titles. In this poster, we discuss the issues and challenges we encountered in the creation and evaluation of the current CV and our future research goals.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"81 1","pages":"475-476"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75907952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}