{"title":"Quality assessment of collaborative content with minimal information","authors":"D. H. Dalip, Harlley Lima, Marcos André Gonçalves, Marco Cristo, P. Calado","doi":"10.1109/JCDL.2014.6970169","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970169","url":null,"abstract":"Content generated by users is one of the most interesting phenomena of published media. However, the possibility of unrestricted edition is a source of doubts about its quality. This issue has motivated many studies on how to automatically assess content quality in collaborative web sites. Generally, these studies use machine learning techniques to combine large number of quality indicators into a single value representing the overall quality of the document. This need for a high number of indicators, however, has detrimental implications both on the efficiency and on the effectiveness of the quality assessment algorithms. In this work, we exploit and extend a feature selection method based on the SPEA2 multi-objective genetic algorithm. Results show that we can reduce the feature set to a fraction of 15% through 25% of the original, while obtaining error rates comparable to the state of the art.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"113 1","pages":"201-210"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79400344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Making research data findable in digital libraries: A layered model for user-oriented indexing of survey data","authors":"Tanja Friedrich, A. Kempf","doi":"10.1109/JCDL.2014.6970150","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970150","url":null,"abstract":"The growing amount of data in research and the aspired culture of data sharing make it necessary to improve data documentation in digital libraries. On these grounds we present a conceptual model for subject indexing of research data. Taking the example of social science survey data we inquire into the applicability of established indexing principles. Based on these principles our research incorporates the special characteristics of social science survey data, leading us to a model of layered subject indexing.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"73 1","pages":"53-56"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86402499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CED2AR: The Comprehensive Extensible Data Documentation and Access Repository","authors":"C. Lagoze, L. Vilhuber, Jeremy Williams, B. Perry, William C. Block","doi":"10.1109/JCDL.2014.6970178","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970178","url":null,"abstract":"We describe the design, implementation, and deployment of the Comprehensive Extensible Data Documentation and Access Repository (CED2AR). This is a metadata repository system that allows researchers to search, browse, access, and cite confidential data and metadata through either a web-based user interface or programmatically through a search API, all the while reusing and linking to existing archive and provider generated metadata. CED2AR is distinguished from other metadata repository-based applications due to requirements that derive from its social science context. These include the need to cloak confidential data and metadata and manage complex provenance chains.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"140 1","pages":"267-276"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82264231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Image-based Chinese Calligraphic Character Retrieval on Large Scale Data","authors":"Gao Pengcheng, Wu Jiangqin, Lin Yuan, Xia Yang, Mao Tianjiao, Wei Baogang","doi":"10.1109/JCDL.2014.6970170","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970170","url":null,"abstract":"Chinese calligraphy is the art of handwriting, it draws a lot of attention for its beauty and elegance. In CADAL, a Calligraphic Character Dictionary (CCD) which contains hundreds of thousands of character images labeled with semantic meaning has been constructed and provided online to common users. It is a great challenge to perform quick and accurate image-based calligraphic character retrieval on CCD. In this paper, a novel shape descriptor, Oriented Shape Context (OSC) is proposed based on the traditional Shape Context (SC) to perform similarity searching. Together with GIST, GIST-OSC descriptor is proposed to represent calligraphic character image for efficient and effective retrieval. In addition, an effective retrieval schema is proposed. The retrieval schema works in two steps. Firstly approximate nearest neighbors of the query image are found quickly using GIST and then one-to-one fine matching between approximate nearest neighbors and the query image is performed using OSC. Our experiments show that the GIST-OSC descriptor and the retrieval schema are efficient and effective for Chinese calligraphic character retrieval on large scale data.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"12 1","pages":"211-220"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90786257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mood metadata for video games and interactive media","authors":"Stephanie Rossi, Jin Ha Lee, R. Clarke","doi":"10.1109/JCDL.2014.6970232","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970232","url":null,"abstract":"Video games are becoming an important part of digital library collections due to increasing popularity and the acknowledgement of their significance as cultural artifacts. In order to support robust search and browse functions, it is imperative to develop a metadata schema to effectively represent this medium. The potential of mood metadata in the domain of video game classification is little explored, despite the value given to it by gamers in user studies. Here, we present a Controlled Vocabulary (CV) for moods related to video games with 17 defined mood terms, equivalent terms, and game examples. This CV will enable catalogers to organize video games by mood, allowing mood to be used for search and collocation. In order to evaluate the applicability of this CV and discover which terms are most relevant for video games, we annotated the mood of a sample collection of 617 video game titles. In this poster, we discuss the issues and challenges we encountered in the creation and evaluation of the current CV and our future research goals.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"81 1","pages":"475-476"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75907952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SNAC: The Social Networks and Archival Context project - Towards an archival authority cooperative","authors":"R. Larson, Daniel V. Pitti, Adrian Turner","doi":"10.1109/JCDL.2014.6970208","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970208","url":null,"abstract":"Social Networks and Archival Context (SNAC) is a multi-year research and demonstration project that aims to address the longstanding research challenge of discovering, locating, and using distributed historical resources. It also seeks to redefine traditional online access points for those resources, by exposing information about the people, families, and organizations who created them in addition to their socio-historical contexts. Finally, SNAC endeavors to set the stage for a cooperative program for maintaining names of creators of archival materials, via the Encoded Archival Context - Corporate Bodies, Persons, and Families (EAC-CPF) standard. This demonstration will show the prototype access and search systems for the second phase of SNAC, incorporating over 2 million records derived from Encoded Archival Descriptions (EAD), MARC Archival Records and EAC-CPF records from over 40 repositories and consortia including the Library of Congress, ArchivesHub, Archives nationales, the Bibliothèque nationale de France (BnF), and OCLC WorldCat.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"55 1","pages":"427-428"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83819371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PerCon: A personal digital library for heterogeneous data","authors":"Su Inn Park, F. Shipman","doi":"10.1109/JCDL.2014.6970155","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970155","url":null,"abstract":"Systems are needed to support access to and analysis of large heterogeneous scientific datasets. We developed PerCon, a data management and analysis environment, to support such activities. PerCon processes and integrates data gathered via queries to existing data providers to create a personal digital library of data. Users may then search, browse, visualize and annotate the data as they proceed with analysis and interpretation. Interpretation in PerCon takes place in a visual workspace in which multiple data visualizations and annotations are placed into spatial arrangements based on the current task. The system watches for patterns in the user's data selection and organization and through mixed-initiative interaction assists users by suggesting potentially relevant data from unexplored data sources. PerCon's data location and analysis capabilities were evaluated in a controlled study with 24 users. Study participants had to locate and analyze heterogeneous weather and river data with and without the visual workspace and mixed-initiative interaction, respectively. Results indicate that the visual workspace facilitated information representation and aided in the identification of relationships between datasets. The system's suggestions encouraged data exploration, leading participants to identify more evidence of correlation among data streams and more potential interactions among weather and river data.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"5 1","pages":"97-106"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89253951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extraction of evolution descriptions from the web","authors":"Helge Holzmann, T. Risse","doi":"10.1109/JCDL.2014.6970201","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970201","url":null,"abstract":"The evolution of named entities affects exploration and retrieval tasks in digital libraries. An information retrieval system that is aware of name changes can actively support users in finding former occurrences of evolved entities. However, current structured knowledge bases, such as DBpedia or Freebase, do not provide enough information about evolutions, even though the data is available on their resources, like Wikipedia. Our Evolution Base prototype will demonstrate how excerpts describing name evolutions can be identified on these Web sites with a promising precision. The descriptions are classified by means of models that we trained based on a recent analysis of named entity evolutions on Wikipedia.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"1 1","pages":"413-414"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88079256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative analysis of disciplinary data management workflows","authors":"S. Dallmeier-Tiessen, A. Lavasa, P. Herterich, L. Rueda, Rachael Kotarski, Elizabeth Newbold","doi":"10.1109/JCDL.2014.6970180","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970180","url":null,"abstract":"Datasets are now an integral part of scholarly communication. The result is that research data has now become a reality in library and information science, and its curation requires dedicated workflows. Here, we compare two disciplinary examples from High-Energy Physics and Humanities and Social Sciences, both referenced to the OAIS conceptual model. Even though we know that the research datasets and their metadata (preparation and curation) are very different in both disciplines, it can be seen that the conceptual workflow models are very similar, including the assignment of persistent identifiers (PIDs). The latter is particularly interesting when discussing the design and implementation of transdisciplinary services in library and information science.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"37 1","pages":"281-284"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80635679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recommendation based on Deduced Social Networks in an educational digital library","authors":"Monika Akbar, C. Shaffer, Weiguo Fan, E. Fox","doi":"10.1109/JCDL.2014.6970147","DOIUrl":"https://doi.org/10.1109/JCDL.2014.6970147","url":null,"abstract":"Discovering useful resources can be difficult in digital libraries with large content collections. Many educational digital libraries (edu-DLs) host thousands of resources. One approach to avoiding information overload involves modeling user behavior. But this often depends on user feedback, along with the demographic information found in user account profiles, in order to model and predict user interests. However, edu-DLs often host collections with open public access, allowing users to navigate through the system without needing to provide identification. With few identifiable users, building models linked to user accounts provides insufficient data to recommend useful resources. Analyzing user activity on a per-session basis, to deduce a latent user network, can take place even without user profiles or prior use history. The resulting Deduced Social Network (DSN) can be used to improve DL services. An example of a DSN is a graph whose nodes are sessions and whose edges connect two sessions that view the same resource. In this paper we present a recommendation framework for edu-DLs that depends on deduced connections between users. Results show that a recommendation system built from DSN-dependent parameters can improve performance compared to when only text similarity between resources is used. Our approach can potentially improve recommendation for DL resources when implicit user activities (e.g., view, click, search) are abundant but explicit user activities (e.g., account creation, rating, comment) are unavailable.","PeriodicalId":92278,"journal":{"name":"Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries","volume":"132 1","pages":"29-38"},"PeriodicalIF":0.0,"publicationDate":"2014-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77409770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}