{"title":"Crowdsourcing worker development based on probabilistic task network","authors":"Masayuki Ashikawa, Takahiro Kawamura, Akihiko Ohsuga","doi":"10.1145/3106426.3106501","DOIUrl":"https://doi.org/10.1145/3106426.3106501","url":null,"abstract":"Crowdsourcing platforms provide an attractive solution for processing numerous tasks at low cost. However, insufficient quality control remains a major concern. In the present study, we propose a grade-based training method for workers. Our training method utilizes probabilistic networks to estimate correlations between tasks based on workers' records for 18.5 million tasks and then allocates pre-learning tasks to the workers to raise the accuracy of target tasks according to the task correlations. In an experiment, the method automatically allocated 31 pre-learning task categories for 9 target task categories, and after the training of the pre-learning tasks, we confirmed that the accuracy of the target tasks was raised by 7.8 points on average. We thus confirmed that the task correlations can be estimated using a large amount of worker records, and that these are useful for the grade-based training of low-quality workers.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90861267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-scale taxonomy induction using entity and word embeddings","authors":"Petar Ristoski, Stefano Faralli, Simone Paolo Ponzetto, Heiko Paulheim","doi":"10.1145/3106426.3106465","DOIUrl":"https://doi.org/10.1145/3106426.3106465","url":null,"abstract":"Taxonomies are an important ingredient of knowledge organization, and serve as a backbone for more sophisticated knowledge representations in intelligent systems, such as formal ontologies. However, building taxonomies manually is a costly endeavor, and hence, automatic methods for taxonomy induction are a good alternative to build large-scale taxonomies. In this paper, we propose TIEmb, an approach for automatic unsupervised class subsumption axiom extraction from knowledge bases using entity and text embeddings. We apply the approach on the WebIsA database, a database of subsumption relations extracted from the large portion of the World Wide Web, to extract class hierarchies in the Person and Place domain.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"95 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88200365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Affective prediction by collaborative chains in movie recommendation","authors":"Yong Zheng","doi":"10.1145/3106426.3106535","DOIUrl":"https://doi.org/10.1145/3106426.3106535","url":null,"abstract":"Recommender systems have been successfully applied to alleviate the information overload and assist user's decision makings. Emotional states have been demonstrated as effective factors in recommender systems. However, how to collect or predict a user's emotional state becomes one of the challenges to build affective recommender systems. In this paper, we explore and compare different solutions to predict emotions to be applied in the recommendation process. More specifically, we propose an approach named as collaborative chains. It predicts emotional states in a collaborative way and additionally takes correlations among emotions into consideration. Our experimental results based on a movie rating data demonstrate the effectiveness of affective prediction by collaborative chains in movie recommendations.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85703981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LCHI: multiple, overlapping local communities","authors":"Moeen Farasat, J. Scripps","doi":"10.1145/3106426.3106438","DOIUrl":"https://doi.org/10.1145/3106426.3106438","url":null,"abstract":"Local community finding algorithms are helpful for finding communities around a seed node especially when the network is large and a global method is too slow. Most local methods find only a single community or are required to be run several times over different seed nodes to create multiple communities. In this paper, we present a new algorithm, LCHI, that finds multiple, overlapping communities around a single node. Examples and analyses are presented to support the effectiveness of LCHI.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81726953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context suggestion: empirical evaluations vs user studies","authors":"Yong Zheng","doi":"10.1145/3106426.3106466","DOIUrl":"https://doi.org/10.1145/3106426.3106466","url":null,"abstract":"Recommender System has been successfully applied to assist user's decision making by providing a list of recommended items. Context-aware recommender system additionally incorporates contexts (such as time and location) into the system to improve the recommendation performance. The development of context-aware recommender systems brings a new opportunity - context suggestion which refers to the task of recommending appropriate contexts to the users to improve user experience. In this paper, we explore the question whether user's contextual ratings can be reused to produce context suggestions. We propose two evaluation mechanisms for context suggestion, and empirically compare direct context predictions and indirect context suggestions based on a movie data that was collected from user studies. The experimental results reveal that indirect context suggestion works better than the direct context prediction, and tensor factorization is the best approach to produce context suggestions in our movie data.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"19-20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82718068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-scale readability analysis of privacy policies","authors":"Benjamin Fabian, Tatiana Ermakova, Tino Lentz","doi":"10.1145/3106426.3106427","DOIUrl":"https://doi.org/10.1145/3106426.3106427","url":null,"abstract":"Online privacy policies notify users of a Website how their personal information is collected, processed and stored. Against the background of rising privacy concerns, privacy policies seem to represent an influential instrument for increasing customer trust and loyalty. However, in practice, consumers seem to actually read privacy policies only in rare cases, possibly reflecting the common assumption stating that policies are hard to comprehend. By designing and implementing an automated extraction and readability analysis toolset that embodies a diversity of established readability measures, we present the first large-scale study that provides current empirical evidence on the readability of nearly 50,000 privacy policies of popular English-speaking Websites. The results empirically confirm that on average, current privacy policies are still hard to read. Furthermore, this study presents new theoretical insights for readability research, in particular, to what extent practical readability measures are correlated. Specifically, it shows the redundancy of several well-established readability metrics such as SMOG, RIX, LIX, GFI, FKG, ARI, and FRES, thus easing future choice making processes and comparisons between readability studies, as well as calling for research towards a readability measures framework. Moreover, a more sophisticated privacy policy extractor and analyzer as well as a solid policy text corpus for further research are provided.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"73 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82807438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Entity oriented action recommendations for actionable knowledge graph generation","authors":"Md. Mostafizur Rahman, A. Takasu","doi":"10.1145/3106426.3106546","DOIUrl":"https://doi.org/10.1145/3106426.3106546","url":null,"abstract":"Popular search engines have recently utilized the power of knowledge graphs (KGs) to provide specific answers to queries in a direct way. Search engine result pages (SERPs) are expected to provide facts in response to queries that satisfy semantic meaning. This encourages researchers to propose more influential knowledge graph generation techniques. To achieve and advance the technologies related to actionable knowledge graph presentation, creating action recommendations (ARs) is an essential step and a relatively new research direction to nurture research on generating KGs that are optimized for facilitating an entity's actions. An action represents the physical or mental activity of an entity. For example, for the entity \"Donald J. Trump\", typical potential actions could be \"won the US presidential election\" or \"targets US journalists\". In this paper, we describe the generation of relevant action recommendations based on entity instance and entity type. We propose two models that employ different approaches. Our first model exploits semisupervised learning and we introduce entity context vector (ECV) as an entity's distinguishing features for capturing the context of entities to reveal the similarity between entities, grounded on the prominent word2vec model. The second model is a probabilistic approach based on the Naive Bayes Theorem. We extensively evaluate our proposed models. Our first model significantly outperforms probabilistic and supervised learning-based models.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"36 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80982480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zero-shot human activity recognition via nonlinear compatibility based method","authors":"Wei Wang, C. Miao, Shuji Hao","doi":"10.1145/3106426.3106526","DOIUrl":"https://doi.org/10.1145/3106426.3106526","url":null,"abstract":"Human activity recognition aims to recognize human activities from sensor readings. Most of existing methods in this area can only recognize activities contained in training dataset. However, in practical applications, previously unseen activities are often encountered. In this paper, we propose a new zero-shot learning method to solve the problem of recognizing previously unseen activities. The proposed method learns a nonlinear compatibility function between feature space instances and semantic space prototypes. With this function, testing instances are classified to unseen activities with highest compatibility scores. To evaluate the effectiveness of the proposed method, we conduct extensive experiments on three public datasets. Experimental results show that our proposed method consistently outperforms state-of-the-art methods in human activity recognition problems.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78756399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CEDAL: time-efficient detection of erroneous links in large-scale link repositories","authors":"André Valdestilhas, Tommaso Soru, A. N. Ngomo","doi":"10.1145/3106426.3106497","DOIUrl":"https://doi.org/10.1145/3106426.3106497","url":null,"abstract":"More than 500 million facts on the Linked Data Web are statements across knowledge bases. These links are of crucial importance for the Linked Data Web as they make a large number of tasks possible, including cross-ontology, question answering and federated queries. However, a large number of these links are erroneous and can thus lead to these applications producing absurd results. We present a time-efficient and complete approach for the detection of erroneous links for properties that are transitive. To this end, we make use of the semantics of URIs on the Data Web and combine it with an efficient graph partitioning algorithm. We then apply our algorithm to the LinkLion repository and show that we can analyze 19,200,114 links in 4.6 minutes. Our results show that at least 13% of the owl:sameAs links we considered are erroneous. In addition, our analysis of the provenance of links allows discovering agents and knowledge bases that commonly display poor linking. Our algorithm can be easily executed in parallel and on a GPU. We show that these implementations are up to two orders of magnitude faster than classical reasoners and a non-parallel implementation.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76784114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inferring win-lose product network from user behavior","authors":"S. Iitsuka, Kazuya Kawakami, S. Hagiwara, T. Kawakami, Takayuki Hamada, Y. Matsuo","doi":"10.1145/3106426.3106502","DOIUrl":"https://doi.org/10.1145/3106426.3106502","url":null,"abstract":"Various data mining techniques to extract product relations have been examined, especially in the context of building intelligent recommender systems. Most such techniques, however, specifically examine co-occurrences of browsed or purchased products on e-commerce websites, which provide little or no useful information related to the direct relation of superiority or the factor which forms that superiority. For marketers and product managers, understanding the competitive advantages of a given product is important to consolidate their product differentiation strategies. As described in this paper, we propose a win-lose relation, a new product relation analysis method that retrieves the superiority relation between competitive products in terms of product attractiveness. Our proposed method uses the difference between user browsing and purchasing behaviors, assuming that a purchased product is superior to products that are browsed but not purchased. We also propose superiority factor analysis to examine keywords that represent the superiority factor by mining product reviews. We evaluate our methods using an actual dataset from Zexy, the largest wedding portal website in Japan. Our experimental evaluation revealed that our proposed method can estimate actual user preferences observed from a user study using only log data. Results also show that our proposed method raises the accuracy of superiority factor extraction by around 17% by considering the win-lose relation of products.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76943515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}