L. Piras, M. Al-Obeidallah, Michalis Pavlidis, H. Mouratidis, A. Tsohou, E. Magkos, Andrea Praitano
{"title":"A Data Scope Management Service to Support Privacy by Design and GDPR Compliance","authors":"L. Piras, M. Al-Obeidallah, Michalis Pavlidis, H. Mouratidis, A. Tsohou, E. Magkos, Andrea Praitano","doi":"10.26421/JDI2.2-3","DOIUrl":"https://doi.org/10.26421/JDI2.2-3","url":null,"abstract":"In order to empower user data protection and user rights, the European General Data Protection Regulation (GDPR) has been enforced. On the positive side, the user is obtaining advantages from GDPR. However, organisations are facing many difficulties in interpreting GDPR and in applying it properly, and, in the meantime, due to their lack of compliance, many organisations are receiving huge fines from authorities. An important challenge is compliance with the Privacy by Design and by default (PbD) principles, which require that data protection is integrated into processing activities and business practices from the design stage. Recently, the European Data Protection Board (EDPB) released an official document with PbD guidelines, and there are various efforts to provide approaches to support these. However, organisations are still facing difficulties in identifying a flow for executing these activities in a coherent, linear and effective way, and a complete toolkit for supporting it. In this paper, we propose the design of such a flow, and our comprehensive supporting toolkit, as part of the DEFeND EU Project platform. Within DEFeND, we identified candidate tools, fulfilling specific GDPR aspects, and integrated them in a comprehensive toolkit: the DEFeND Data Scope Management service (DSM). The aim of DSM is to support organisations in achieving continuous GDPR compliance through model-based Privacy by Design analysis. Here, we present DSM, its design, flow, and a preliminary case study and evaluation performed with pilots from the healthcare, banking, public administration and energy sectors.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122115239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kostas Kolomvatsos, Maria Kalouda, Panagiota Papadopoulou, S. Hadjiefthymiades
{"title":"Fuzzy Trust Modelling for Pervasive Computing Applications","authors":"Kostas Kolomvatsos, Maria Kalouda, Panagiota Papadopoulou, S. Hadjiefthymiades","doi":"10.26421/JDI2.2-1","DOIUrl":"https://doi.org/10.26421/JDI2.2-1","url":null,"abstract":"Pervasive computing applications involve the interaction between autonomous entities for performing complex tasks and producing knowledge. Autonomous entities can interact to exchange data and knowledge to fulfil application requirements. Intelligent Agents (IAs) ‘activated’ in various devices offer a lot of advantages when representing such entities due to their autonomous nature that enables them to perform the desired tasks in a distributed way. However, in such open and dynamic environments, IAs should be based on an efficient mechanism for trusting unknown entities when exchanging data. The trust level of an entity should be automatically calculated based on an efficient methodology. Each entity is uncertain about the characteristics and the intentions of the others. Fuzzy Logic (FL) seems to be an appropriate tool for handling this kind of uncertainty. In this paper, we present a model for trust calculation under the principles of FL. Our scheme takes into consideration the social dimension of trust as well as personal experiences of entities before they decide to interact with an IA. The proposed model is a two-level system involving three FL sub-systems to calculate (a) the social trust (based on experiences retrieved from the community), (b) the individual trust (based on personal experiences) and (c) the final trust. We present our results by evaluating the proposed system compared to other models and reveal its significance.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124836668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonard Tan, Thuan Pham, Hang Kei Ho, Tan Seng Kok
{"title":"Event Prediction in Online Social Networks","authors":"Leonard Tan, Thuan Pham, Hang Kei Ho, Tan Seng Kok","doi":"10.26421/JDI2.1-4","DOIUrl":"https://doi.org/10.26421/JDI2.1-4","url":null,"abstract":"Event prediction is a very important task in numerous applications of interest such as fintech, medicine and security. However, event prediction is a highly complex task because events are challenging to classify and involve temporally changing themes of discussion and heavy topic drift. In this research, we present a novel approach which leverages the RFT framework developed in \\cite{tan2020discovering}. This study addresses the challenge of accurately representing relational features in observed complex social communication behavior for the event prediction task, which recent graph learning methodologies struggle with. The concept here is first to learn the turbulent patterns of relational state transitions between actors preceding an event and then to evolve these profiles temporally in the event prediction process. The event prediction model, which leverages the RFT framework, discovers, identifies and adaptively ranks relational turbulence as likelihood predictions of event occurrences. Extensive experiments on large-scale social datasets across important indicator tests for validation show that the RFT framework outperforms HPM \\cite{amodeo2011hybrid} and other state-of-the-art baselines by more than 10% in event prediction.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117173173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Philipp, Andreas Mladenow, C. Strauss, A. Voelz
{"title":"Revealing Challenges within the Application of Machine Learning Services - A Delphi Study","authors":"R. Philipp, Andreas Mladenow, C. Strauss, A. Voelz","doi":"10.26421/JDI2.1-1","DOIUrl":"https://doi.org/10.26421/JDI2.1-1","url":null,"abstract":"Over the past years, Machine Learning has been applied to an increasing number of problems across numerous industries. However, the steady rise in the application of Machine Learning has not come without challenges since companies often lack the expertise or infrastructure to build their own Machine Learning systems. These challenges led to the emergence of a new paradigm, called Machine Learning as a Service. Scientific literature has mainly analyzed this topic in the context of platform solutions that provide ready-to-use environments for companies. We have recently developed a platform-independent approach and labeled it Machine Learning Services. The aim of the present study is to identify and evaluate challenges and opportunities in the application of Machine Learning Services. To do so, we conducted a Delphi Study with a panel of machine learning experts. The study consisted of three rounds and was structured according to the five steps of the Data Science Lifecycle. A variety of challenges from the areas “Communication”, “Environment”, “Approach”, “Data”, “Retraining, Testing, Monitoring and Updating”, “Model Training and Evaluation” were identified. Subsequently, the challenges revealed by the Delphi Study were compared with previous work on Machine Learning as a Service, which resulted from a structured literature review. The identified areas serve as possible future research fields and give further implications for practice. Alleviating communication issues and assessing the business IT infrastructure prior to the machine learning project are among the key findings of our study.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114184746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Schema-level Index Models for Web Data Search","authors":"A. Scherp, Till Blume","doi":"10.26421/JDI2.1-3","DOIUrl":"https://doi.org/10.26421/JDI2.1-3","url":null,"abstract":"Indexing the Web of Data offers many opportunities, in particular, to find and explore data sources. One major design decision when indexing the Web of Data is to find a suitable index model, i.e., how to index and summarize data. Various efforts have been made to develop specific index models for a given task. With each index model designed, implemented, and evaluated independently, it remains difficult to judge whether an approach generalizes well to another task, set of queries, or dataset. In this work, we empirically evaluate six representative index models with unique feature combinations. Among them is a new index model incorporating inferencing over RDFS and \\texttt{owl:sameAs}. For the first time, we implement all index models in a single, stream-based framework. We evaluate variations of the index models considering sub-graphs of size $0$, $1$, and $2$ hops on two large, real-world datasets. We evaluate the quality of the indices regarding the compression ratio, summarization ratio, and F1-score denoting the approximation quality of the stream-based index computation. The experiments reveal huge variations in compression ratio, summarization ratio, and approximation quality for different index models, queries, and datasets. However, we observe meaningful correlations in the results that help to determine the right index model for a given task, type of query, and dataset.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"362 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115930511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Collaborative Filtering based Approach to Classify Movie Genres using User Ratings","authors":"Raji Ghawi, J. Pfeffer","doi":"10.26421/JDI1.4-3","DOIUrl":"https://doi.org/10.26421/JDI1.4-3","url":null,"abstract":"In this paper, we present an approach for classifying movie genres based on user ratings. Our approach is based on collaborative filtering (CF), a common technique used in recommendation systems, where the similarity between movies, based on user ratings, is used to predict the genres of movies. The results of the conducted experiments show that our genre classification approach outperforms many existing approaches, achieving an F1-score of 0.70 and a hit-rate of 94%. We also construct a multilayer network of movies, with genres as layers. We apply agglomerative clustering on the layers of this network to obtain a comprehensible taxonomy of genres which groups together similar genres using the similarity of their movies in terms of user preferences.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129620003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Caio Libânio Melo Jerônimo, L. Marinho, Claudio E.C. Campelo, Adriano Veloso, A. S. C. Melo
{"title":"Characterization of Fake News Based on Subjectivity Lexicons","authors":"Caio Libânio Melo Jerônimo, L. Marinho, Claudio E.C. Campelo, Adriano Veloso, A. S. C. Melo","doi":"10.26421/JDI1.4-2","DOIUrl":"https://doi.org/10.26421/JDI1.4-2","url":null,"abstract":"While many works investigate spread patterns of fake news in social networks, we focus on the textual content. Instead of relying on syntactic representations of documents (aka Bag of Words) as many works do, we seek more robust representations that may better differentiate fake from legitimate news. We propose to consider the subjectivity of news under the assumption that the subjectivity levels of legitimate and fake news are significantly different. For computing the subjectivity level of news, we rely on a set of subjectivity lexicons for both Brazilian Portuguese and English. We then build subjectivity feature vectors for each news article by calculating the Word Mover's Distance (WMD) between the news and these lexicons, considering the embedding space in which the news words lie, in order to analyze and classify the documents. The results demonstrate that our method is robust, especially in scenarios where training and test domains are different.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"156 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122611251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clause-level Analysis High-value Reviews based on Sentiment","authors":"Akiyo Nadamoto, Kazuhiro Akiyama, T. Kumamoto","doi":"10.26421/JDI1.4-4","DOIUrl":"https://doi.org/10.26421/JDI1.4-4","url":null,"abstract":"Today, huge numbers of reviews are posted on the internet. Online shoppers often refer to reviews written about the products. A review has a star rating that represents what other people think about the product. However, the star rating is not always appropriate for evaluating the product. High-value reviews that affect the users' willingness to buy are independent of the number of stars in ratings. High-value reviews are those from which people obtain useful information, that is, those regarded as good reviews. In this paper, we investigated the relation between high-value reviews and the sentiment (positive/negative/neutral) of their clauses based on four hypotheses. We extract characteristics of high-value reviews based on our results. Furthermore, we propose a classification method that classifies clause-level sentiment from reviews.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126826685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Partial Annotation Scheme for Active Learning on Named Entity Recognition Tasks","authors":"Koga Kobayashi, Kei Wakabayashi","doi":"10.26421/JDI1.3-2","DOIUrl":"https://doi.org/10.26421/JDI1.3-2","url":null,"abstract":"Active learning is a promising approach to alleviate the expensive annotation cost of creating training data for named entity recognition (NER) tasks. However, since existing active learning methods for NER tasks implicitly assume a full annotation scheme in which the unit of an annotation request is the whole sentence, the efficiency of the data instance selection is limited. In this paper, we propose a new active learning method based on a partial annotation scheme, which selects a part of the sentences to be annotated and asks human annotators to label a specific part of the target sentences. In the experiment, we show that the partial annotation scheme trains the proposed point-wise prediction model more quickly than the existing active learning methods for NER tasks.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124015680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Independent Game Developers and Their Expectations Towards Recommender Systems","authors":"Marta Kholodylo, C. Strauss","doi":"10.26421/JDI1.3-1","DOIUrl":"https://doi.org/10.26421/JDI1.3-1","url":null,"abstract":"Electronic recommender systems and digital distribution as such have transformed many industries. Digital games is one of those industries where the transformation is particularly evident, as more and more games are appearing on the market, and most of the titles are published by independent game developers. Due to electronic recommender systems, developers can now self-publish their content without the mediation of third parties and additional costs for their services. This change has significantly decreased the costs of game production, distribution, and marketing, allowing more studios to engage in releasing their games on their own. However, it is unclear how the developers themselves perceive the effect of electronic recommender systems on their business models. This paper presents a qualitative study on the impact of electronic recommender systems in the context of independent game development. Based on semi-structured expert interviews with active game developers who have been engaged in promoting their games through electronic recommender systems, our study provides insights into how independent game developers perceive those systems as part of their value chain and their business model. The results of the study are relevant to (i) independent game developers seeking to establish, adapt, review, or improve their business model, and (ii) providers and developers of electronic recommender systems, as an indication of the needs, requirements and expectations of their potential content creators.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127325070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}