{"title":"Creating and validating a scholarly knowledge graph using natural language processing and microtask crowdsourcing.","authors":"Allard Oelen, Markus Stocker, Sören Auer","doi":"10.1007/s00799-023-00360-7","DOIUrl":"10.1007/s00799-023-00360-7","url":null,"abstract":"<p>Due to the growing number of scholarly publications, finding relevant articles becomes increasingly difficult. Scholarly knowledge graphs can be used to organize the scholarly knowledge presented within those publications and represent them in machine-readable formats. Natural language processing (NLP) provides scalable methods to automatically extract knowledge from articles and populate scholarly knowledge graphs. However, NLP extraction is generally not sufficiently accurate and, thus, fails to generate high granularity quality data. In this work, we present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. TinyGenius is employed to populate a paper-centric knowledge graph, using five distinct NLP methods. We extend our previous work of the TinyGenius methodology in various ways. Specifically, we discuss the NLP tasks in more detail and include an explanation of the data model. Moreover, we present a user evaluation where participants validate the generated NLP statements. 
The results indicate that employing microtasks for statement validation is a promising approach despite the varying participant agreement for different microtasks.</p>","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11208198/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89892513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OAVA: the open audio-visual archives aggregator","authors":"Polychronis Charitidis, Sotirios Moschos, Chrysostomos Bakouras, Stavros Doropoulos, Giorgos Makris, Nikolas Mauropoulos, Ilias Nitsos, Sofia Zapounidou, Afrodite Malliari","doi":"10.1007/s00799-023-00384-z","DOIUrl":"https://doi.org/10.1007/s00799-023-00384-z","url":null,"abstract":"<p>The purpose of the current article is to provide an overview of an open-access audiovisual aggregation and search service platform developed for Greek audiovisual content during the OAVA (Open Access AudioVisual Archive) project. The platform allows the search of audiovisual resources utilizing metadata descriptions, as well as full-text search utilizing content generated from automatic speech recognition (ASR) processes through deep learning models. A dataset containing reliable Greek audiovisual content providers and their resources (1710 in total) is created. Both providers and resources are reviewed according to specific criteria already established and used for content aggregation purposes, to ensure the quality of the content and to avoid copyright infringements. Well-known aggregation services and well-established schemas for audiovisual resources have been studied and considered regarding both aggregated content and metadata. Most Greek audiovisual content providers do not use established metadata schemas when publishing their content, nor is technical cooperation with them guaranteed. Thus, a model is developed for reconciliation and aggregation. To utilize audiovisual resources, the OAVA platform makes use of the latest state-of-the-art ASR approaches. The OAVA platform supports Greek and English speech-to-text models. Specifically for Greek, to mitigate the scarcity of available datasets, a large-scale ASR dataset is annotated to train and evaluate deep learning architectures. 
The result of the above-mentioned efforts, namely selection of content, metadata, development of appropriate ASR techniques, and aggregation and enrichment of content and metadata, is the OAVA platform. This unified search mechanism for Greek audiovisual content will serve teaching, research, and cultural activities. OAVA platform is available at: https://openvideoarchives.gr/.</p>","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138686808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"User versus institutional perspectives of metadata and searching: an investigation of online access to cultural heritage content during the COVID-19 pandemic","authors":"Ryan Colin Gibson, Sudatta Chowdhury, Gobinda Chowdhury","doi":"10.1007/s00799-023-00385-y","DOIUrl":"https://doi.org/10.1007/s00799-023-00385-y","url":null,"abstract":"<p>Findings from log analyses of user interactions with the digital content of two large national cultural heritage institutions (National Museums of Scotland and National Galleries of Scotland) during the COVID-19 lockdown highlighted limited engagement compared to pre-pandemic levels. Just 8% of users returned to these sites, whilst the average time spent, and number of pages accessed, were generally low. This prompted a user study to investigate the potential mismatch between the way content was indexed by the curators and searched for by users. A controlled experiment with ten participants, involving two tasks and a selected set of digital cultural heritage content, explored: (a) how does the metadata assigned by cultural heritage organisations meet or differ from the search needs of users? and (b) how can the search strategies of users inform the search pathways employed by cultural heritage organisations? Findings reveal that collection management standards like <i>Spectrum</i> encourage a variety of different characteristics to be considered when developing metadata, yet much of the content is left to the interpretations of curators. Rather, user- and context-specific guidelines could be beneficial in ensuring the aspects considered most important by consumers are indexed, thereby producing more relevant search results. A user-centred approach to designing cultural heritage websites would help to improve an individual’s experience when searching for information. 
However, a process is needed for institutions to form a concrete understanding of who their target users are before developing features and designs to suit their specific needs and interests.</p>","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138686506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing the examination of obstacles in an automated peer review system","authors":"Gustavo Lúcius Fernandes, Pedro O. S. Vaz-de-Melo","doi":"10.1007/s00799-023-00382-1","DOIUrl":"https://doi.org/10.1007/s00799-023-00382-1","url":null,"abstract":"<p>The peer review process is the main academic resource to ensure that science advances and is disseminated. To contribute to this important process, classification models were created to perform two tasks: the <i>review score prediction</i> (<i>RSP</i>) and the <i>paper decision prediction</i> (<i>PDP</i>). But what challenges prevent us from having a fully efficient system responsible for these tasks? And how far are we from having an automated system to take care of these two tasks? To answer these questions, in this work, we evaluated the general performance of existing state-of-the-art models for <i>RSP</i> and <i>PDP</i> tasks and investigated what types of instances these models tend to have difficulty classifying and how impactful they are. We found, for example, that the performance of a model to predict the final decision of a paper is 23.31% lower when it is exposed to difficult instances and that the classifiers make mistakes with very high confidence. These and other results lead us to conclude that there are groups of instances that can negatively impact the model’s performance. 
Thus, the current state-of-the-art models have the potential to help editors decide whether to approve or reject a paper; however, we are still far from having a system that is fully responsible for scoring a paper and deciding whether it will be accepted or rejected.</p>","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Focused Issue on Digital Library Challenges to Support the Open Science Process","authors":"Giorgio Maria Di Nunzio","doi":"10.1007/s00799-023-00388-9","DOIUrl":"https://doi.org/10.1007/s00799-023-00388-9","url":null,"abstract":"<p>Open Science is the broad term that involves several aspects aiming to remove the barriers for sharing any kind of output, resources, methods or tools, at any stage of the research process (https://book.fosteropenscience.eu/en/). The Open Science process is a set of transparent research practices that help to improve the quality of scientific knowledge and are crucial to the most basic aspects of the scientific process by means of the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. Thanks to research transparency and accessibility, we can evaluate the credibility of scientific claims and make the research process reproducible and the obtained results replicable. In this context, digital libraries play a pivotal role in supporting the Open Science process by facilitating the storage, organization, and dissemination of research outputs, including open access publications and open data. 
In this focused issue, we invited researchers to discuss innovative solutions, also related to technical challenges, about the identifiability of digital objects as well as the use of metadata and ontologies in order to support replicable and reusable research, the adoption of standards and semantic technologies to link information, and the evaluation of the application of the FAIR principles.</p>","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Universities, heritage, and non-museum institutions: a methodological proposal for sustainable documentation","authors":"Marina Salse-Rovira, Nuria Jornet-Benito, Javier Guallar, Maria Pilar Mateo-Bretos, Josep Oriol Silvestre-Canut","doi":"10.1007/s00799-023-00383-0","DOIUrl":"https://doi.org/10.1007/s00799-023-00383-0","url":null,"abstract":"To provide a sustainable methodology for documenting the small (and underfunded) but often important university heritage collections. The sequence proposed by the DBLC (Database Life Cycle) (Coronel and Morris, Database Systems: Design, Implementation, & Management. Cengage Learning, Boston, 2018; Oppel Databases a beginner’s guide. McGraw-Hill, New York, 2009) is followed, focusing on the database design phase. The resulting proposals aim at harmonising the different documentation tools developed by GLAM institutions (acronym that aims to highlight the common aspects of Galleries, Libraries, Archives and Museums), all of which are present in the university environment. The work phases are based mainly on the work of Valle, Fernández Cacho, and Arenillas (Muñoz Cruz et al. Introducción a la documentación del patrimonio cultural. Consejería de Cultura de la Junta de Andalucía, Seville, 2017), combined with the experience acquired from the creation of the virtual museum at our institution. The creation of a working team that includes university staff members is recommended because we believe that universities have sufficient power to manage their own heritage. For documentation, we recommend the use of application profiles that consider the new trends in semantic web and LOD (Linked Open Data) and that are created using structural interchange standards such as Dublin Core, LIDO, or Darwin Core, which should be combined with content and value standards adapted from the GLAM area. 
The application of the methodology described above will make it possible to obtain quality metadata in a sustainable way given the limited resources of university collections. A proposed metadata schema is provided as an annex.","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136235408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Libraries, Epigraphy and Paleography: Bring Records from the Distant Past to the Present: Part II","authors":"Stephen M. Griffin","doi":"10.1007/s00799-023-00381-2","DOIUrl":"https://doi.org/10.1007/s00799-023-00381-2","url":null,"abstract":"The two volumes of this Special Issue explore the intersections of digital libraries, epigraphy and paleography. Digital libraries research, practices and infrastructures have transformed the study of ancient inscriptions by providing organizing principles for collections building, defining interoperability requirements and developing innovative user tools and services. Yet linking collections and their contents to support advanced scholarly work in epigraphy and paleography tests the limits of current digital libraries applications. This is due, in part, to the magnitude and heterogeneity of works created over a time period of more than five millennia. The remarkable diversity ranges from the types of artifacts to the methods used in their production to the singularity of individual marks contained within them. Conversion of analogue collections to digital repositories is well underway, but most often not in a way that meets the basic requirements needed to support scholarly workflows. This is beginning to change as collections and content are being described more fully with rich annotations and metadata conforming to established standards. New uses of imaging technologies and computational approaches are remediating damaged works and revealing text that has, over time, become illegible or hidden. Transcription of handwritten text to machine-readable form is still primarily a manual process, but research into automated transcription is moving forward. 
Progress in digital libraries research and practices coupled with collections development of ancient written works suggests that epigraphy and paleography will gain new prominence in the Academy.","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135428471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analytical developments for the Homer Multitext: palaeography, orthography, morphology, prosody, semantics","authors":"Neel Smith, Christopher Blackwell","doi":"10.1007/s00799-023-00380-3","DOIUrl":"https://doi.org/10.1007/s00799-023-00380-3","url":null,"abstract":"","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135304948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges in replaying archived Twitter pages","authors":"Kritika Garg, Himarsha R. Jayanetti, Sawood Alam, Michele C. Weigle, Michael L. Nelson","doi":"10.1007/s00799-023-00379-w","DOIUrl":"https://doi.org/10.1007/s00799-023-00379-w","url":null,"abstract":"","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86163407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graduate student search strategies within academic digital libraries","authors":"O. Hoeber, D. Storie","doi":"10.1007/s00799-023-00378-x","DOIUrl":"https://doi.org/10.1007/s00799-023-00378-x","url":null,"abstract":"","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85877053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}