Najmeh Mousavi Nejad, S. Scerri, S. Auer, E. Sibarani
{"title":"EULAide: Interpretation of End-User License Agreements using Ontology-Based Information Extraction","authors":"Najmeh Mousavi Nejad, S. Scerri, S. Auer, E. Sibarani","doi":"10.1145/2993318.2993324","DOIUrl":"https://doi.org/10.1145/2993318.2993324","url":null,"abstract":"Ignoring End-User License Agreements (EULAs) for online services due to their length and complexity is a risk undertaken by the majority of online and mobile service users. This paper presents an Ontology-Based Information Extraction (OBIE) method for EULA term and phrase extraction to facilitate a better understanding by humans. An ontology capturing important terms and relationships has been developed and used to guide the OBIE process. Through a feedback cycle we have improved its domain-specific coverage by identifying additional concepts. In the detection and extraction, we focus on three key rights and conditions: permission, prohibition and duty. We present the EULAide system, which comprises a custom information extraction pipeline and a number of custom extraction rules tailored for EULA processing. To evaluate our approach, we created and manually annotated a corpus of 20 well-known licenses. For the gold standard we achieved an Inter-Annotator Agreement (IAA) of 90%, resulting in 193 permissions, 185 prohibitions and 168 duties. An evaluation of the OBIE pipeline against this gold standard resulted in an F-measure of 70-74% which, in the context of the IAA, proves the feasibility of the approach.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"307 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129572977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Dynamics and Semantics of User Interests for User Modeling on Twitter for Link Recommendations","authors":"Guangyuan Piao, J. Breslin","doi":"10.1145/2993318.2993332","DOIUrl":"https://doi.org/10.1145/2993318.2993332","url":null,"abstract":"User modeling for individual users on the Social Web plays an important role and is a fundamental step for personalization as well as recommendations. Recent studies have proposed different user modeling strategies considering various dimensions such as temporal dynamics and semantics of user interests. Although previous work proposed different user modeling strategies considering the temporal dynamics of user interests, there is a lack of comparative studies on those methods, and therefore their relative performance is unknown. In terms of semantics of user interests, background knowledge from DBpedia has been explored to enrich user interest profiles so as to reveal more information about users. However, it is still unclear to what extent different types of information from DBpedia contribute to the enrichment of user interest profiles. In this paper, we propose user modeling strategies which use Concept Frequency - Inverse Document Frequency (CF-IDF) as a weighting scheme and incorporate either or both of the dynamics and semantics of user interests. To this end, we first provide a comparative study on different user modeling strategies considering the dynamics of user interests in previous literature to present their comparative performance. In addition, we investigate different types of information (i.e., categories, classes and connected entities via various properties) for entities from DBpedia and the combination of them for extending user interest profiles. Finally, we build our user modeling strategies incorporating either or both of the best-performing methods in each dimension. Results show that our strategies significantly outperform two baseline strategies in the context of link recommendations on Twitter.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131065013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Edgard Marx, A. Zaveri, Diego Moussallem, Sandro Rautenberg
{"title":"DBtrends: Exploring Query Logs for Ranking RDF Data","authors":"Edgard Marx, A. Zaveri, Diego Moussallem, Sandro Rautenberg","doi":"10.1145/2993318.2993322","DOIUrl":"https://doi.org/10.1145/2993318.2993322","url":null,"abstract":"Many ranking methods have been proposed for RDF data. These methods often use the structure behind the data to measure its importance. Recently, some of these methods have started to explore information from other sources such as the Wikipedia page graph to better rank RDF data. In this work, we propose DBtrends, a ranking function based on query logs. We extensively evaluate the application of different ranking functions for entities, classes, and properties across two different countries as well as their combination. Thereafter, we propose MIXED-RANK, a ranking function that combines DBtrends with the best-evaluated entity ranking function. We show that: (i) MIXED-RANK outperforms state-of-the-art entity ranking functions; and (ii) query logs can be used to improve RDF ranking functions.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"145 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116863372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reginald Ford, G. Denker, D. Elenius, Wesley Moore, Elie Abi-Lahoud
{"title":"Automating Financial Regulatory Compliance Using Ontology+Rules and Sunflower","authors":"Reginald Ford, G. Denker, D. Elenius, Wesley Moore, Elie Abi-Lahoud","doi":"10.1145/2993318.2993329","DOIUrl":"https://doi.org/10.1145/2993318.2993329","url":null,"abstract":"Compliance departments in the international finance industry are struggling to use traditional methods to keep up with the demands of new and more stringent regulatory and policy requirements. One initiative supported by many institutions is the definition of a common Financial Industry Business Ontology (FIBO). We regard a common ontology as an important step, but in order to support real-world use cases, the ontology needs to be augmented, and further supplemented by rules that encode the meaning of regulations and policies. We use Sunflower, which is built on top of the Flora-2 knowledge representation language and reasoner, to add automation to the compliance lifecycle. Sunflower is domain-agnostic, and financial regulatory compliance is one of its many application areas.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126879146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gianluca Correndo, Simon Crowle, J. Papay, M. Boniface
{"title":"Enhancing Marine Industry Risk Management Through Semantic Reconciliation of Underwater IoT Data Streams","authors":"Gianluca Correndo, Simon Crowle, J. Papay, M. Boniface","doi":"10.1145/2993318.2993330","DOIUrl":"https://doi.org/10.1145/2993318.2993330","url":null,"abstract":"The \"Rio+20\" United Nations Conference on Sustainable Development (UNCSD) focused on the \"Green economy\" as the main concept to fight poverty and achieve a sustainable way to feed the planet. For coastal countries, this concept translates into \"Blue economy\", the sustainable exploitation of marine environments to fulfill humanity's needs for resources, energy, and food. This puts a stress on marine industries to better articulate their processes to gain and share knowledge of different marine habitats, to reevaluate the data value chains established in the past, and to support a data-fueled market that is only going to grow in the near future. The EXPOSURES project is working in conjunction with the SUNRISE project to establish a new marine information ecosystem and demonstrate how the 'Internet of Things' (IoT) can be exploited for marine applications. In particular, EXPOSURES engaged with the community of stakeholders in order to identify a new data value chain which includes IoT data providers, data analysts, and harbor authorities. Moreover, we integrated the key technological assets that couple OGC standards for raster data management and manipulation and semantic technologies to better manage data assets. This paper presents the identified data value chain along with the use cases for validating it, and the system developed to semantically reconcile and manage such data collections.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129835832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"KESeDa: Knowledge Extraction from Heterogeneous Semi-Structured Data Sources","authors":"Martin Seidel, M. Krug, Frank Burian, M. Gaedke","doi":"10.1145/2993318.2993335","DOIUrl":"https://doi.org/10.1145/2993318.2993335","url":null,"abstract":"A large part of the free knowledge existing on the Web is available as heterogeneous, semi-structured data, which is only weakly interlinked and in general does not include any semantic classification. Due to the enormous amount of information the necessary preparation of this data for integrating it in the Web of Data requires automated processes. The extraction of knowledge from structured as well as unstructured data has already been the topic of research. But especially for the semi-structured data format JSON, which is widely used as a data exchange format e.g., in social networks, extraction solutions are missing. Based on the findings we made by analyzing existing extraction methods, we present our KESeDa approach for extracting knowledge from heterogeneous, semi-structured data sources. We show how knowledge can be extracted by describing different analysis and processing steps. With the resulting semantically enriched data the potential of Linked Data can be utilized.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122045613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Collaboration on RDF Datasets Using Git: Towards the Quit Store","authors":"Natanael Arndt, Norman Radtke, Michael Martin","doi":"10.1145/2993318.2993328","DOIUrl":"https://doi.org/10.1145/2993318.2993328","url":null,"abstract":"Collaboration is one of the most important topics regarding the evolution of the World Wide Web and thus also for the Web of Data. In scenarios of distributed collaboration on datasets it is necessary to provide support for multiple different versions of datasets to exist simultaneously, while also providing support for merging diverged datasets. In this paper we present an approach that uses SPARQL 1.1 in combination with the version control system Git, that creates commits for all changes applied to an RDF dataset containing multiple named graphs. Further the operations provided by Git are used to distribute the commits among collaborators and merge diverged versions of the dataset. We show the advantages of (public) Git repositories for RDF datasets and how this represents a way to collaborate on RDF data and consume it. With SPARQL 1.1 and Git in combination, users are given several opportunities to participate in the evolution of RDF data.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131410973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Supervised KeyPhrase Extraction System","authors":"K. Adebayo, Luigi Di Caro, G. Boella","doi":"10.1145/2993318.2993323","DOIUrl":"https://doi.org/10.1145/2993318.2993323","url":null,"abstract":"In this paper, we present a multi-featured supervised automatic keyword extraction system. We extracted salient semantic features that are descriptive of candidate keyphrases and used a Random Forest classifier for training. The system achieved 58.3% precision and has been shown to outperform two top-performing systems when benchmarked on a crowdsourced dataset. Furthermore, our approach achieved a personal best Precision and F-measure score of 32.7 and 25.5 respectively on the Semeval Keyphrase extraction challenge dataset. The paper describes the approaches used as well as the results obtained.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134267541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Don't compare Apples to Oranges: Extending GERBIL for a fine grained NEL evaluation","authors":"J. Waitelonis, Henrik Jürges, Harald Sack","doi":"10.1145/2993318.2993334","DOIUrl":"https://doi.org/10.1145/2993318.2993334","url":null,"abstract":"In recent years, named entity linking (NEL) tools were primarily developed as general approaches, whereas today numerous tools are focusing on specific domains such as the mapping of persons and organizations only, or the annotation of locations or events in microposts. However, the available benchmark datasets used for the evaluation of NEL tools do not reflect this focalizing trend. We have analyzed the evaluation process applied in the NEL benchmarking framework GERBIL [16] and its benchmark datasets. Based on these insights we extend the GERBIL framework to enable a more fine-grained evaluation and in-depth analysis of the used benchmark datasets according to different emphases. In this paper, we present the implementation of an adaptive filter for arbitrary entities as well as a system to automatically measure benchmark dataset properties, such as the extent of content-related ambiguity and diversity. The implementation as well as a result visualization are integrated in the publicly available GERBIL framework.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132655434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic Processing for the Conversion of Unstructured Documents into Structured Information in the Enterprise Context","authors":"Adam Bartusiak, Jörg Lässig","doi":"10.1145/2993318.2993341","DOIUrl":"https://doi.org/10.1145/2993318.2993341","url":null,"abstract":"We present an on-going research project addressing the problem of massive amounts of unstructured data that are generated on a daily basis in most business organisations, regardless of size. Our motivation is to support in particular small and medium-sized enterprises to gain a competitive advantage in the market. The goal is to improve their processes for extracting valuable business information from such disorganised data. To achieve this, we introduce a flexible and scalable data analysis framework capable of transforming various types of documents into semantically annotated structures. This includes emails, text files in various formats, slide presentations, blog entries, etc. Additionally, the solution provides a semantic search engine for structured retrieval of the analyzed information and a graphical layer to dynamically visualize the search results as an interactive graph. Throughout the paper, the architecture of the two main engines that are responsible for data and text analysis and semantic search is described. We conclude that semantic processing of unstructured sources significantly improves data management and data integration within enterprises.","PeriodicalId":177013,"journal":{"name":"Proceedings of the 12th International Conference on Semantic Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132450055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}