{"title":"Making literature review and manuscript writing tasks easier for novice researchers through Rec4LRW system","authors":"Aravind Sesagiri Raamkumar, S. Foo, N. Pang","doi":"10.1145/2910896.2925445","DOIUrl":"https://doi.org/10.1145/2910896.2925445","url":null,"abstract":"We demonstrate the recently built Rec4LRW system, meant for assisting researchers in three literature review and manuscript writing tasks. The system has been designed to be useful for all researchers, albeit the evaluation results show that it is more beneficial for research students and beginners. In this demonstration, we provide a walkthrough of the system by executing the tasks with sample research topics. The unique User-Interface (UI) and the task interconnectivity features are some of the highlighted aspects.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131731702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Curve separation for line graphs in scholarly documents","authors":"Sagnik Ray Choudhury, Shuting Wang, C. Lee Giles","doi":"10.1145/2910896.2925469","DOIUrl":"https://doi.org/10.1145/2910896.2925469","url":null,"abstract":"Line graphs are abundant in scholarly papers. They are usually generated from a data table and that data can not be accessed. One important step in an automated data extraction pipeline is the curve separation problem: segmenting the pixels into separate curves. Previous work in this domain has focused on raster graphics extracted from scholarly PDFs, whereas most scholarly plots are embedded as vector graphics. We report a system to extract these plots as SVG images and show how that can improve both the accuracy (90%) and the scalability (5-8 seconds) of the curve separation problem.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116315360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic bookworm: Mining literary resources revisited","authors":"A. Hinze, Michael Coleman, S. Cunningham, D. Bainbridge","doi":"10.1145/2910896.2925444","DOIUrl":"https://doi.org/10.1145/2910896.2925444","url":null,"abstract":"In this paper, we describe Semantic Bookworm - a tool that supports scholarly text analysis. In contrast to the text-based Bookworm tool, the Semantic Bookworm identifies semantic concepts.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116132538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Games for crowdsourcing mobile content: An analysis of contribution patterns","authors":"D. Goh, Ei Pa Pa Pe-Than, C. S. Lee","doi":"10.1145/2910896.2925455","DOIUrl":"https://doi.org/10.1145/2910896.2925455","url":null,"abstract":"Crowdsourcing of mobile content through games is becoming a major way of populating information-rich online environments. A current research gap is that actual usage patterns of crowdsourcing games has been inadequately investigated. We address this gap by comparing content creation patterns in a game for crowdsourcing mobile content against a non-game version. Results show distinct differences in the types and distribution of content created.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129836326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ArchiveSpark: Efficient Web archive access, extraction and derivation","authors":"Helge Holzmann, V. Goel, Avishek Anand","doi":"10.1145/2910896.2910902","DOIUrl":"https://doi.org/10.1145/2910896.2910902","url":null,"abstract":"Web archives are a valuable resource for researchers of various disciplines. However, to use them as a scholarly source, researchers require a tool that provides efficient access to Web archive data for extraction and derivation of smaller datasets. Besides efficient access we identify five other objectives based on practical researcher needs such as ease of use, extensibility and reusability. Towards these objectives we propose ArchiveSpark, a framework for efficient, distributed Web archive processing that builds a research corpus by working on existing and standardized data formats commonly held by Web archiving institutions. Performance optimizations in ArchiveSpark, facilitated by the use of a widely available metadata index, result in significant speed-ups of data processing. Our benchmarks show that ArchiveSpark is faster than alternative approaches without depending on any additional data stores while improving usability by seamlessly integrating queries and derivations with external tools.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114801889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quality assessment of Wikipedia articles without feature engineering","authors":"Quang-Vinh Dang, C. Ignat","doi":"10.1145/2910896.2910917","DOIUrl":"https://doi.org/10.1145/2910896.2910917","url":null,"abstract":"As Wikipedia became the largest human knowledge repository, quality measurement of its articles received a lot of attention during the last decade. Most research efforts focused on classification of Wikipedia articles quality by using a different feature set. However, so far, no “golden feature set” was proposed. In this paper, we present a novel approach for classifying Wikipedia articles by analysing their content rather than by considering a feature set. Our approach uses recent techniques in natural language processing and deep learning, and achieved a comparable result with the state-of-the-art.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"235 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131667858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating cluster stability when analyzing transaction logs","authors":"D. Grech, Paul D. Clough","doi":"10.1145/2910896.2910923","DOIUrl":"https://doi.org/10.1145/2910896.2910923","url":null,"abstract":"Data-driven approaches have become increasingly popular as a means for analyzing transaction logs from web search engines and digital libraries, for example using cluster analysis to identify common patterns of search and navigation behavior. However, steps must be taken to ensure that results are reliable and repeatable. Although clustering patterns of user interaction behavior has been previously explored, one aspect that has received less attention is cluster stability that can be used to aid cluster validation. In this paper we compute stability based on the Jaccard coefficient to investigate the cluster stability when using different subsets of transaction log data from WorldCat.org. Results provide insights into different types of search behaviors and highlight that clusters of varying degrees of stability will result from the clustering process. However, we show that additional investigation beyond the results of cluster stability is required to fully validate the resulting clusters.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133958484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Content selection and curation for web archiving: The gatekeepers vs. the masses","authors":"Ian Milligan, Nick Ruest, Jimmy J. Lin","doi":"10.1145/2910896.2910913","DOIUrl":"https://doi.org/10.1145/2910896.2910913","url":null,"abstract":"Any preservation effort must begin with an assessment of what content to preserve, and web archiving is no different. There have historically been two answers to the question “what should we archive?” The Internet Archive's broad entire-web crawls have been supplemented by narrower domain- or topic-specific collections gathered by numerous libraries. We can characterize this as content selection and curation by “gatekeepers”. In contrast, we have witnessed the emergence of another approach driven by “the masses” - we can archive pages that are contained in social media streams such as Twitter. The interesting question, of course, is how these approaches differ. We provide an answer to this question in the context of a case study about the 2015 Canadian federal elections. Based on our analysis, we recommend a hybrid approach that combines an effort driven by social media and more traditional curatorial methods.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117214290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Question identification and classification on an academic question answering site","authors":"B. Ojokoh, Tobore Igbe, A. Araoye, Friday Ameh","doi":"10.1145/2910896.2925442","DOIUrl":"https://doi.org/10.1145/2910896.2925442","url":null,"abstract":"Online communities such as wikis, blogs, forums, scientific communities and other social networking services have enabled new levels of interactions and interconnections among individuals, documents and data and have become places for people to seek and share expertise. In this paper, we propose a systematic approach to identification and classification of questions. The questions were first identified using semantic occurrence of Part of Speech (POS) tag in English Language, after which they were classified based on maximum probability value of Naïve Bayes classification. The model was validated and evaluated with experiments on some crawled web pages from ResearchGate.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128143877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The state of practice and use of digital collections: The digital public library of America as a platform for research","authors":"R. Frick","doi":"10.1145/2910896.2926741","DOIUrl":"https://doi.org/10.1145/2910896.2926741","url":null,"abstract":"Summary form only given. In 2016, Digital Public Library of America is celebrating the third year of its cultural heritage metadata aggregator service. Since its launch, the DPLA collection has grown to represent over 13 million objects and over 1900 institutions, from small historical societies to large research libraries. With onramps, or hubs, in over 20 states, DPLA is well on its way to complete the coverage map by the end of 2017. As it continues to build this amazing dataset, DPLA is taking the time to examine what lessons are to be learned from this unprecedented resource, as the organization's sustainability is directly tied to how the collection grows, how it measures use, and proving its value to the communities it serves. What does this collection data tell us about the state of bibliographic holdings information, and the knowledge and skills and abilities of those who create records, not just for local use, but for use in other environments and contexts? How well does the metadata perform when it leaves its original context? Working with colleagues at Europeana, DPLA has begun investigating and addressing the problematic issues regarding access and reuse of digital objects in the collective by examining current ways rights are expressed in the metadata, working towards standardization of this information. Ms. Frick will discuss DPLA's rights work, as well as other potential areas of research and DPLA's strategy for future growth.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128871418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}