{"title":"Applying cross-data set identity reasoning for producing URI embeddings over hundreds of RDF data sets","authors":"M. Mountantonakis, Yannis Tzitzikas","doi":"10.1504/IJMSO.2021.117103","DOIUrl":"https://doi.org/10.1504/IJMSO.2021.117103","url":null,"abstract":"There is a proliferation of approaches that exploit RDF data sets for creating URI embeddings, i.e., embeddings that are produced by taking as input URI sequences (instead of simple words or phrases), since they can be of primary importance for several tasks (e.g., machine learning tasks). However, existing techniques exploit either a single or a few data sets for creating URI embeddings. For this reason, we introduce a prototype, called LODVec, which exploits LODsyndesis for enabling the creation of URI embeddings by using hundreds of data sets simultaneously, after enriching them with the results of cross-data set identity reasoning. By using LODVec, it is feasible to produce URI sequences by following paths of any length (according to a given configuration), and the produced URI sequences are used as input for creating embeddings through word2vec model. We provide comparative results for evaluating the gain of using several data sets for creating URI embeddings, for the tasks of classification and regression, and for finding the most similar entities to a given one.","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122945171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Children's art museum collections as Linked Open Data","authors":"Konstantinos I. Kotis, Sotiris Angelis, M. Chondrogianni, Efstathia Marini","doi":"10.1504/ijmso.2021.10040249","DOIUrl":"https://doi.org/10.1504/ijmso.2021.10040249","url":null,"abstract":"It has been recently argued that it is rather beneficial to cultural institutions to provide their datasets as Linked Open Data, to achieve cross-referencing, interlinking, and integration with other datasets in the LOD cloud. In this paper, we present the Greek Children's Art Museum (GCAM) linked dataset, along with dataset and vocabulary statistics, as well as lessons learned from the process of transforming the collections to HTML-embedded structured data using the Europeana Data Model and the Schema.org model. The dataset consists of three cultural collections of 121 child artworks (paintings), including detailed descriptions and interlinks to external datasets. In addition to the presentation of GCAM data and the lessons learned from the experimentation of non-ICT experts with LOD paradigm, the paper introduces a new metric for measuring datasets quality in terms of links to and from other datasets.","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123827228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Documenting flooding areas calculation: a PROV approach","authors":"M. D. Martino, A. Quarati, S. Rosim, L. Namikawa","doi":"10.1504/IJMSO.2021.117106","DOIUrl":"https://doi.org/10.1504/IJMSO.2021.117106","url":null,"abstract":"Flooding events related to waste-lake dam ruptures are one of the most threatening natural disasters in Brazil. They must be managed in advance by public institutions through the use of adequate hydrographic and environmental information. Although the Open Data paradigm offers an opportunity to share hydrographic data sets, their actual reuse is still low because of metadata quality. Our previous work highlighted a lack of detailed provenance information. The paper presents an Open Data approach to improve the release of hydrographic data sets. We discuss a methodology, based on W3C recommendations, for documenting the provenance of hydrographic data sets, considering the workflow activities related to the study of flood areas caused by the waste-lakes breakdowns. We provide an illustrative example that documents, through W3C PROV metadata model, the generation of flooding area maps by integrating land use classification, from Sentinel images, with hydrographic data sets produced by the Brazilian National Institute for Space Research.","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114587382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An ontology-driven perspective on the emotional human reactions to social events","authors":"Danilo Cavaliere, S. Senatore","doi":"10.1504/IJMSO.2021.117104","DOIUrl":"https://doi.org/10.1504/IJMSO.2021.117104","url":null,"abstract":"Social media has become a fulcrum for sharing information on everyday-life events: people, companies, and organisations express opinions about new products, political and social situations, football matches, and concerts. The recognition of feelings and reactions to events from social networks requires dealing with great amounts of data streams, especially for tweets, to investigate the main sentiments and opinions that justify some reactions. This paper presents an emotion-based classification model to extract feelings from tweets related to an event or a trend, described by a hashtag, and build an emotional concept ontology to study human reactions to events in a context. From the tweet analysis, terms expressing a feeling are selected to build a topological space of emotion-based concepts. The extracted concepts serve to train a multi-class SVM classifier that is used to perform soft classification aimed at identifying the emotional reactions towards events. Then, an ontology allows arranging classification results, enriched with additional DBpedia concepts. SPARQL queries on the final knowledge base provide specific insights to explain people's reactions towards events. Practical case studies and test results demonstrate the applicability and potential of the approach.","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116238022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Persons, GLAM institutes and collections: an analysis of entity linking based on the COURAGE registry","authors":"Ghazal Faraj, A. Micsik","doi":"10.1504/IJMSO.2021.117105","DOIUrl":"https://doi.org/10.1504/IJMSO.2021.117105","url":null,"abstract":"It is an important task to connect encyclopaedic knowledge graphs by finding and linking the same entity nodes. Various available automated linking solutions cannot be applied in situations where data is sparse, private or a high degree of correctness is expected. Wikidata has grown into a leading linking hub collecting entity identifiers from various registries and repositories. To get a picture of connectability, we analysed the linking methods and results between the COURAGE registry and Wikidata, VIAF, ISNI and ULAN. This paper describes our investigations and solutions while mapping and enriching entities in Wikidata. Each possible mapped pair of entities received a numeric score of reliability. Using this score-based matching method, we tried to minimise the need for human decisions, hence we introduced the term human decision window for the mappings where neither acceptance nor refusal can be made automatically and safely. Furthermore, Wikidata has been enriched with related COURAGE entities and bi-directional links between mapped persons, organisations, collections, and collection items. We also describe the findings on coverage and quality of mapping among the above mentioned authority databases.","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"1246 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127118428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of structured data on Wikipedia","authors":"Johny Moreira, Everaldo Costa Silva Neto, Luciano Barbosa","doi":"10.1504/ijmso.2021.10040250","DOIUrl":"https://doi.org/10.1504/ijmso.2021.10040250","url":null,"abstract":"Wikipedia has been widely used for information consumption or for implementing solutions using its content. It contains primarily unstructured text about entities, but it can also contain infoboxes...","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":" 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132124924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The future of interlinked, interoperable and scalable metadata","authors":"Getaneh Alemu, E. Garoufallou","doi":"10.1504/ijmso.2020.10030303","DOIUrl":"https://doi.org/10.1504/ijmso.2020.10030303","url":null,"abstract":"With the growing diversity of information resources the emphasis on data-centric applications such as big data, metadata, semantics and ontologies has become central. This editorial paper presents a summary of recent developments in metadata, semantics and ontologies - focusing in particular on metadata enriching, linking and interoperability. National libraries and archives are devising new bibliographic models and metadata presentation formats. Bibliographic metadata sets are being made available using these new data formats such as RDF. The new formats are aiming to represent data in granular structures and define unique identification protocols such as URIs. The paper concludes by introducing the five papers included in the special issue. The papers in this special issue present novel approaches to metadata integration, interoperability frameworks, re-use of metadata ontologies and methods of metadata quality analysis.","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131105780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From the web of bibliographic data to the web of bibliographic meaning: structuring, interlinking and validating ontologies on the semantic web","authors":"H. Patrício, M. G. Cordeiro, P. Ramos","doi":"10.1504/IJMSO.2020.10030294","DOIUrl":"https://doi.org/10.1504/IJMSO.2020.10030294","url":null,"abstract":"Bibliographic data sets have revealed good levels of technical interoperability observing the principles and good practices of linked data. However, they have a low level of quality from the semantic point of view, due to many factors: lack of a common conceptual framework for a diversity of standards often used together, reduced number of links between the ontologies underlying data sets, proliferation of heterogeneous vocabularies, underuse of semantic mechanisms in data structures, \"ontology hijacking\" (Feeney et al., 2018), point-to-point mappings, as well as limitations of semantic web languages for the requirements of bibliographic data interoperability. After reviewing such issues, a research direction is proposed to overcome the misalignments found by means of a reference model and a superontology, using Shapes Constraint Language (SHACL) to solve current limitations of RDF languages.","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132097960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CMDIfication process for textbook resources","authors":"F. Fallucchi, E. W. D. Luca","doi":"10.1504/ijmso.2020.10030299","DOIUrl":"https://doi.org/10.1504/ijmso.2020.10030299","url":null,"abstract":"Interoperability between heterogeneous resources and services is the key to a correctly functioning digital infrastructure, which can provide shared resources at once. We analyse the establishment of a standardised common infrastructure covering metadata, content, and inferred knowledge to allow collaborative work between researchers in the humanities. In this paper, we discuss how to provide a CMDI (Component MetaData Infrastructure) profile for textbooks, in order to integrate it into the Common Language Resources and Technology Infrastructure (CLARIN) and thus to make the data available in an open way and according to the FAIR principles. We focus on the 'CMDIfication' process, which fulfils the needs of our related projects. We describe a process of building resources using CMDI description from Text Encoding Initiative (TEI), Metadata Encoding and Transmission Standard (METS) and Dublin Core (DC) metadata, testing it on the textbook resources of the Georg Eckert Institute (GEI).","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124920380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the utility of metadata record graphs and network analysis for metadata quality evaluation and augmentation","authors":"M. Phillips, Oksana L. Zavalina, H. Tarver","doi":"10.1504/ijmso.2020.10030296","DOIUrl":"https://doi.org/10.1504/ijmso.2020.10030296","url":null,"abstract":"Our study explores the possible uses and effectiveness of network analysis, including Metadata Record Graphs, for evaluating collections of metadata records at scale. We present the results of an experiment applying these methods to records in the University of North Texas (UNT) Digital Library and two sub-collections of different compositions: the UNT Scholarly Works collection, which functions as an institutional repository, and a collection of architectural slide images. The data includes count- and value-based statistics with network metrics for every Dublin Core element in each set. The study finds that network analysis provides useful information that supplements other metrics, for example by identifying records that are completely unconnected to other items through the subject, creator, or other field values. Additionally, network density may help managers identify collections or records that could benefit from enhancement. We also discuss the constraints of these metrics and suggest possible future applications.","PeriodicalId":111629,"journal":{"name":"Int. J. Metadata Semant. Ontologies","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131134079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}