Francesco Mambrini, M. Passarotti, Eleonora Litta, Giovanni Moretti
{"title":"Interlinking Valency Frames and WordNet Synsets in the LiLa Knowledge Base of Linguistic Resources for Latin","authors":"Francesco Mambrini, M. Passarotti, Eleonora Litta, Giovanni Moretti","doi":"10.3233/ssw210032","DOIUrl":"https://doi.org/10.3233/ssw210032","url":null,"abstract":"This paper describes the steps taken to model a valency lexicon for Latin (Latin Vallex) according to the principles of the Linked Data paradigm, and to interlink its valency frames with the lexical senses recorded in a manually checked subset of the Latin WordNet. The valency lexicon and the WordNet share lexical entries and are part of the LiLa Knowledge Base, which interlinks multiple linguistic resources for Latin. After describing the overall architecture of LiLa, as well as the structure of the lexical entries of Latin Vallex and Latin WordNet, the paper focuses on how valency frames have been modeled in LiLa, in line with a submodule of the Predicate Model for Ontologies (PreMOn) specifically created for the representation of grammatical valency. A mapping of the valency frames and the WordNet synsets assigned to the lexical entries shared by the two resources is detailed, as well as a number of queries that can be run across the interoperable resources for Latin currently included in LiLa.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"2009 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127324653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vincenzo Cutrona, Gianluca Puleri, Federico Bianchi, M. Palmonari
{"title":"NEST: Neural Soft Type Constraints to Improve Entity Linking in Tables","authors":"Vincenzo Cutrona, Gianluca Puleri, Federico Bianchi, M. Palmonari","doi":"10.3233/ssw210033","DOIUrl":"https://doi.org/10.3233/ssw210033","url":null,"abstract":"Matching tables against Knowledge Graphs is a crucial task in many applications. A widely adopted solution to improve the precision of matching algorithms is to refine the set of candidate entities by their type in the Knowledge Graph. However, it is not rare that a type is missing for a given entity. In this paper, we propose a methodology to improve the refinement phase of matching algorithms based on type prediction and soft constraints. We apply our methodology to state-of-the-art algorithms, showing a performance boost on different datasets.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126716058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalizing Type-Based Facet Ranking Using BERT Embeddings","authors":"Esraa Ali, A. Caputo, S. Lawless, Owen Conlan","doi":"10.3233/ssw210040","DOIUrl":"https://doi.org/10.3233/ssw210040","url":null,"abstract":"In Faceted Search Systems (FSS), users navigate the information space through facets, which are attributes or meta-data that describe the underlying content of the collection. Type-based facets (aka t-facets) help explore the categories associated with the searched objects in structured information space. This work investigates how personalizing t-facet ranking can minimize user effort to reach the intended search target. We propose a lightweight personalisation method based on Vector Space Model (VSM) for ranking the t-facet hierarchy in two steps. The first step scores each individual leaf-node t-facet by computing the similarity between the t-facet BERT embedding and the user profile vector. In this model, the user’s profile is expressed in a category space through vectors that capture the users’ past preferences. In the second step, this score is used to re-order and select the sub-tree to present to the user. The final ranked tree reflects the t-facet relevance both to the query and the user profile. Through the use of embeddings, the proposed method effectively handles unseen facets without adding extra processing to the FSS. The effectiveness of the proposed approach is measured by the user effort required to retrieve the sought item when using the ranked facets. The approach outperformed existing personalization baselines.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130008843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TODO: A Core Ontology for Task-Oriented Dialogue Systems in Industry 4.0","authors":"Cristina Aceta, Izaskun Fernández, Aitor Soroa Etxabe","doi":"10.3233/ssw210031","DOIUrl":"https://doi.org/10.3233/ssw210031","url":null,"abstract":"Nowadays, the demand in industry of dialogue systems to be able to naturally communicate with industrial systems is increasing, as they allow to enhance productivity and security in these scenarios. However, adapting these systems to different use cases is a costly process, due to the complexity of the scenarios and the lack of available data. This work presents the Task-Oriented Dialogue management Ontology (TODO), which aims to provide a core and complete base for semantic-based task-oriented dialogue systems in the context of industrial scenarios in terms of, on the one hand, domain and dialogue modelling and, on the other hand, dialogue management and tracing support. Furthermore, its modular structure, besides grouping specific knowledge in independent components, allows to easily extend each of the modules, attending the necessities of the different use cases. These characteristics allow an easy adaptation of the ontology to different use cases, with a considerable reduction of time and costs. So as to demonstrate the capabilities of the the ontology by integrating it in a task-oriented dialogue system, TODO has been validated in real-world use cases. Finally, an evaluation is also presented, covering different relevant aspects of the ontology.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124797021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Lieber, Dylan Van Assche, Sally Chambers, F. Messens, Friedel Geeraert, Julie M. Birkholz, Anastasia Dimou
{"title":"BESOCIAL: A Sustainable Knowledge Graph-Based Workflow for Social Media Archiving","authors":"S. Lieber, Dylan Van Assche, Sally Chambers, F. Messens, Friedel Geeraert, Julie M. Birkholz, Anastasia Dimou","doi":"10.3233/ssw210045","DOIUrl":"https://doi.org/10.3233/ssw210045","url":null,"abstract":"Social media as infrastructure for public discourse provide valuable information that needs to be preserved. Several tools for social media harvesting exist, but still only fragmented workflows may be formed with different combinations of such tools. On top of that, social media data but also preservation-related metadata standards are heterogeneous, resulting in a costly manual process. In the framework of BESOCIAL at the Royal Library of Belgium (KBR), we develop a sustainable social media archiving workflow that integrates heterogeneous data sources in a Europeana and PREMIS-based data model to describe data preserved by open source tools. This allows data stewardship on a uniform representation and we generate metadata records automatically via queries. In this paper, we present a comparison of social media harvesting tools and our Knowledge Graph-based solution which reuses off-the-shelf open source tools to harvest social media and automatically generate preservation-related metadata records. We validate our solution by generating Encoded Archival Description (EAD) and bibliographic MARC records for preservation of harvested social media collections from Twitter collected at KBR. Other archiving institutions can build upon our solution and customize it to their own social media archiving policies.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125643462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hashim Khan, Manzoor Ali, A. N. Ngomo, Muhammad Saleem
{"title":"When is the Peak Performance Reached? An Analysis of RDF Triple Stores","authors":"Hashim Khan, Manzoor Ali, A. N. Ngomo, Muhammad Saleem","doi":"10.3233/ssw210042","DOIUrl":"https://doi.org/10.3233/ssw210042","url":null,"abstract":"With significant growth in RDF datasets, application developers demand online availability of these datasets to meet the end users’ expectations. Various interfaces are available for querying RDF data using SPARQL query language. Studies show that SPARQL end-points may provide high query runtime performance at the cost of low availability. For example, it has been observed that only 32.2% of public endpoints have a monthly uptime of 99–100%. One possible reason for this low availability is the high workload experienced by these SPARQL endpoints. As complete query execution is performed at server side (i.e., SPARQL endpoint), this high query processing workload may result in performance degradation or even a service shutdown. We performed extensive experiments to show the query processing capabilities of well-known triple stores by using their SPARQL endpoints. In particular, we stressed these triple stores with multiple parallel requests from different querying agents. Our experiments revealed the maximum query processing capabilities of these triple stores after which point they lead to service shutdowns. We hope this analysis will help triple store developers to design workload-aware RDF engines to improve the availability of their public endpoints with high throughput.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127236081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building a Data Processing Activities Catalog: Representing Heterogeneous Compliance-Related Information for GDPR Using DCAT-AP and DPV","authors":"P. Ryan, H. Pandit, Rob Brennan","doi":"10.3233/ssw210043","DOIUrl":"https://doi.org/10.3233/ssw210043","url":null,"abstract":"This paper describes a new semantic metadata-based approach to describing and integrating diverse data processing activity descriptions gathered from heterogeneous organisational sources such as departments, divisions, and external processors. This information must be collated to assess and document GDPR legal compliance, such as creating a Register of Processing Activities (ROPA). Most GDPR knowledge graph research to date has focused on developing detailed compliance graphs. However, many organisations already have diverse data collection tools for documenting data processing activities, and this heterogeneity is likely to grow in the future. We provide a new approach extending the well-known DCAT-AP standard utilising the data privacy vocabulary (DPV) to express the concepts necessary to complete a ROPA. This approach enables data catalog implementations to merge and federate the metadata for a ROPA without requiring full alignment or merging all the underlying data sources. To show our approach’s feasibility, we demonstrate a deployment use case and develop a prototype system based on diverse data processing records and a standard set of SPARQL queries for a Data Protection Officer preparing a ROPA to monitor compliance. Our catalog’s key benefits are that it is a lightweight, metadata-level integration point with a low cost of compliance information integration, capable of representing processing activities from heterogeneous sources.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130194865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Christof Bless, Lukas Dötlinger, Michael Kaltschmid, Markus Reiter, Anelia Kurteva, Antonio J. Roa-Valverde, A. Fensel
{"title":"Raising Awareness of Data Sharing Consent Through Knowledge Graph Visualisation","authors":"Christof Bless, Lukas Dötlinger, Michael Kaltschmid, Markus Reiter, Anelia Kurteva, Antonio J. Roa-Valverde, A. Fensel","doi":"10.3233/ssw210034","DOIUrl":"https://doi.org/10.3233/ssw210034","url":null,"abstract":"Knowledge graphs facilitate systematic large-scale data analysis by providing both human and machine-readable structures, which can be shared across different domains and platforms. Nowadays, knowledge graphs can be used to standardise the collection and sharing of user information in many different sectors such as transport, insurance, smart cities and internet of things. Regulations such as the GDPR make sure that users are not taken advantage of when they share data. From a legal standpoint it is necessary to have the user’s consent to collect information. This consent is only valid if the user is aware about the information collected at all times. To increase this awareness, we present a knowledge graph visualisation approach, which informs users about the activities linked to their data sharing agreements, especially after they have already given their consent. To visualise the graph, we introduce a user-centred application which showcases sensor data collection and distribution to different data processors. Finally, we present the results of a user study conducted to find out whether this visualisation leads to more legal awareness and trust. We show that with our visualisation tool data sharing consent rates increase from 48% to 81.5%.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121746543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing RDF Stream Processing for Uncertainty Management","authors":"Robin Keskisärkkä, E. Blomqvist, O. Hartig","doi":"10.3233/ssw210039","DOIUrl":"https://doi.org/10.3233/ssw210039","url":null,"abstract":"RDF Stream Processing (RSP) has been proposed as a way of bridging the gap between the Complex Event Processing (CEP) paradigm and the Semantic Web standards. Uncertainty has been recognized as a critical aspect in CEP, but it has received little attention within the context of RSP. In this paper, we investigate the impact of different RSP optimization strategies for uncertainty management. The paper describes (1) an extension of the RSP-QL⋆ data model to capture bind expressions, filter expressions, and uncertainty functions; (2) optimization techniques related to lazy variables and caching of uncertainty functions, and a heuristic for reordering uncertainty filters in query plans; and (3) an evaluation of these strategies in a prototype implementation. The results show that using a lazy variable mechanism for uncertainty functions can improve query execution performance by orders of magnitude while introducing negligible overhead. The results also show that caching uncertainty function results can improve performance under most conditions, but that maintaining this cache can potentially add overhead to the overall query execution process. Finally, the effect of the proposed heuristic on query execution performance was shown to depend on multiple factors, including the selectivity of uncertainty filters, the size of intermediate results, and the cost associated with the evaluation of the uncertainty functions.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121791085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Vollmers, Rricha Jalota, Diego Moussallem, Hardik Topiwala, A. N. Ngomo, Ricardo Usbeck Data Science Group, Paderborn University, H Germany, Fraunhofer Iais, Dresden.
{"title":"Knowledge Graph Question Answering using Graph-Pattern Isomorphism","authors":"Daniel Vollmers, Rricha Jalota, Diego Moussallem, Hardik Topiwala, A. N. Ngomo, Ricardo Usbeck Data Science Group, Paderborn University, H Germany, Fraunhofer Iais, Dresden.","doi":"10.3233/SSW210038","DOIUrl":"https://doi.org/10.3233/SSW210038","url":null,"abstract":"Knowledge Graph Question Answering (KGQA) systems are often based on machine learning algorithms, requiring thousands of question-answer pairs as training examples or natural language processing pipelines that need module fine-tuning. In this paper, we present a novel QA approach, dubbed TeBaQA. Our approach learns to answer questions based on graph isomorphisms from basic graph patterns of SPARQL queries. Learning basic graph patterns is efficient due to the small number of possible patterns. This novel paradigm reduces the amount of training data necessary to achieve state-of-the-art performance. TeBaQA also speeds up the domain adaption process by transforming the QA system development task into a much smaller and easier data compilation task. In our evaluation, TeBaQA achieves state-of-the-art performance on QALD-8 and delivers comparable results on QALD-9 and LC-QuAD v1. Additionally, we performed a fine-grained evaluation on complex queries that deal with aggregation and superlative questions as well as an ablation study, highlighting future research challenges.","PeriodicalId":275036,"journal":{"name":"International Conference on Semantic Systems","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125628624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}