2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI): Latest Papers, Page 5

Visited Websites May Reveal Users’ Demographic Information and Personality

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352525

Cheng-You Lien, Guo-Jhen Bai, Hung-Hsuan Chen

Abstract: This study shows that simple supervised learning algorithms can easily predict a user’s personality and demographic information from features derived from the user’s browsing logs, even when the logs are not recorded at the finest granularity (i.e., each URL the user visited). This differs from the analytical approach of Cambridge Analytica (CA), which reported that it needs to know each user’s liked objects on Facebook (e.g., articles, pages) at a fine granularity (i.e., the liked articles themselves, not only the types of the articles) to predict user information. In contrast, we employed only the categories of the visited websites to predict a user’s gender, age, relationship status, and big six personality scores, an authoritative index that represents an individual’s personality in six dimensions. We also show that applying simple clustering as a preprocessing step enhances the predictive power. As a result, data collectors, even when storing the users’ visited URLs only at a coarse granularity, may leverage such information to identify a user’s preferences and tastes and her or his private information without notifying users.
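
The coarse-grained features described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical category vocabulary (`CATEGORIES`) and a hypothetical helper `category_features`; neither comes from the paper, and the classifier and clustering steps are left out.

```python
from collections import Counter

# Hypothetical category vocabulary; the paper's actual categories are not listed here.
CATEGORIES = ["news", "shopping", "sports", "travel"]


def category_features(visited_categories):
    """Turn a user's visited-site categories into a normalized frequency vector."""
    counts = Counter(visited_categories)
    total = sum(counts[c] for c in CATEGORIES)
    if total == 0:
        # No known categories visited: return an all-zero vector.
        return [0.0] * len(CATEGORIES)
    return [counts[c] / total for c in CATEGORIES]


# One user's log, recorded only at the category level (no URLs).
log = ["news", "news", "sports", "shopping"]
print(category_features(log))
```

A vector like this could then be passed to any supervised learner; which learner and which clustering step the authors used is not stated in the abstract.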

Citations: 5

Cenote: A Big Data Management and Analytics Infrastructure for the Web of Things

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352531

Kyriakos C. Chatzidimitriou, Michail D. Papamichail, Napoleon-Christos I. Oikonomou, Dimitrios Lampoudis, A. Symeonidis

Citations: 10

SemKeyphrase: An Unsupervised Approach to Keyphrase Extraction from MOOC Video Lectures

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352535

A. Albahr, D. Che, M. Albahar

Citations: 5

Deep Dynamic Mixed Membership Stochastic Blockmodel

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352511

Zheng Yu, M. Pietrasik, M. Reformat

Abstract: Latent community models are successful at statistically modeling network data by assigning network entities to communities and modeling entity relations as the relations of their communities. In this paper, we describe the limitation of these models in inferring relations between two communities when the entity relations between these communities are unobserved. We propose a solution to this problem by factorizing the community relations matrix into two community feature matrices, thereby adding a dependency between community relations. We introduce the deep dynamic mixed membership stochastic blockmodel based network (DDBN) to demonstrate the feasibility of such an approach. Our model marries the mixed membership stochastic blockmodel (MMSB) with deep neural networks for rich feature extraction and introduces a temporal dependency in latent features using a long short-term memory unit for dynamic network modeling. We evaluate our model on the link prediction task in static and dynamic networks and find that it achieves results comparable with state-of-the-art methods.

CCS Concepts: • Computing methodologies → Neural networks; Learning in probabilistic graphical models; Factorization methods.
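
The factorization idea in the abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the DDBN itself: the feature matrices `U` and `V`, the sigmoid link, and the function names are all hypothetical, and the neural network and LSTM components are omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def community_relations(U, V):
    """Build B[k][l] = sigmoid(U[k] . V[l]) from two community feature matrices.

    Every pair of communities gets a value, including pairs with no observed links,
    because each entry depends on shared features (assumed form, not the paper's).
    """
    return [[sigmoid(sum(a * b for a, b in zip(u, v))) for v in V] for u in U]


def link_probability(pi_i, pi_j, B):
    """Marginal MMSB link probability pi_i^T B pi_j for two membership vectors."""
    return sum(
        pi_i[k] * B[k][l] * pi_j[l]
        for k in range(len(pi_i))
        for l in range(len(pi_j))
    )


# Two communities with two-dimensional features (made-up numbers).
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[2.0, 0.0], [0.0, 2.0]]
B = community_relations(U, V)
print(link_probability([0.7, 0.3], [0.2, 0.8], B))
```

Because each entry of `B` is computed from rows of `U` and `V`, changing one community's features shifts all of its relations at once, which is the dependency between community relations that the abstract refers to.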

Citations: 1

FIF: A NLP-based Feature Identification Framework for Data Warehouses

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352530

A. Prabhune, Ashish Chouhan

Citations: 1

Which machine learning paradigm for fake news detection?

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352552

Dimitrios Katsaros, G. Stavropoulos, Dimitrios Papakostas

Citations: 39

Linear Scheduling of Big Data Streams on Multiprocessor Sets in the Cloud

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352507

Nicoleta Tantalaki, S. Souravlas, M. Roumeliotis, S. Katsavounis

Citations: 11

Multi-parameter streaming outlier detection

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352520

Theodoros Toliopoulos, A. Gounaris

Abstract: Distance-based outlier detection is a widespread methodology for anomaly detection. Despite their effectiveness, a main limitation of these techniques is that they rely heavily on the dataset and on the parameters chosen to establish the right status of each data point. These parameters typically include, but are not limited to, the neighborhood radius and threshold. In continuous streaming environments, the need for real-time analysis does not permit an algorithm to be restarted multiple times with different parameters until the right combination is found. This gives rise to the need for a single technique that combines an arbitrary number of parameterizations with the use of minimal yet sufficient computing resources. In this work, we compare the state-of-the-art techniques for handling multiple queries in distance-based outlier detection algorithms, and we propose a novel technique for multi-parameter distance-based outlier detection tailored to distributed continuous streaming environments, such as Spark and Flink.

CCS Concepts: • Information systems → Data stream mining; • Computing methodologies → Anomaly detection; Massively parallel algorithms.
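
To make the role of the radius and threshold parameters concrete, here is a minimal single-machine sketch over one static window. It is not the authors' distributed technique: the function names are hypothetical, and it only shows how several (radius, threshold) queries can share one pass of distance computations instead of restarting per parameterization.

```python
import math
from bisect import bisect_right


def neighbor_distances(points):
    """For each point, the sorted distances to all other points (computed once)."""
    return [
        sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def outliers_for_queries(points, queries):
    """Answer several (radius, min_neighbors) queries from one distance pass.

    A point is an outlier for (radius, k) if fewer than k other points
    lie within the given radius of it.
    """
    dists = neighbor_distances(points)
    answers = {}
    for radius, k in queries:
        answers[(radius, k)] = [
            i for i, d in enumerate(dists) if bisect_right(d, radius) < k
        ]
    return answers


window = [(0, 0), (0, 1), (1, 0), (10, 10)]
print(outliers_for_queries(window, [(1.5, 2), (20.0, 3)]))
```

In a streaming setting the window would slide and the neighbor lists would need incremental maintenance; that part is outside this sketch.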

Citations: 7

EpiRep: Learning Node Representations through Epidemic Dynamics on Networks

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3360738

B. Shi, Jianan Zhong, Qing Bao, Hongjun Qiu, Jiming Liu

Abstract: Understanding the dynamic properties of epidemic spreading on complex social networks is essential for making effective and efficient public health policies for epidemic prevention and control. In recent years, network embedding, whose purpose is to encode the relationships or information of networked elements into a low-dimensional vector space, has attracted considerable attention for various network analytic tasks. However, most existing embedding methods have focused mainly on preserving static network information, such as structural proximity, node/edge attributes, and labels. In contrast, in this paper we focus on the embedding problem of preserving dynamic characteristics of epidemic spreading on social networks. We propose a novel embedding method, namely EpiRep, to learn node representations of a network by maximizing the likelihood of preserving groups of infected nodes due to the epidemics starting from every single node on the network. Specifically, the Susceptible-Infectious model is adopted to simulate the epidemic dynamics on networks, and the Continuous Bag-of-Words model with negative sampling is used to obtain node representations. Experimental results show that the EpiRep method outperforms two benchmark random-walk based embedding methods in terms of node clustering and classification on several synthetic and real-world networks. The proposed method and findings in this paper may offer new insight for source identification and infection prevention in the face of epidemic spreading on social networks.

CCS Concepts: • Computer systems organization → Embedded systems; Redundancy; Robotics; • Networks → Network reliability.
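
The simulation step named in the abstract can be sketched as follows. This is a minimal discrete-time Susceptible-Infectious simulation under assumed parameters (`beta`, `steps`) with hypothetical function names; the embedding step (Continuous Bag-of-Words with negative sampling) is not implemented here.

```python
import random


def simulate_si(adjacency, seed_node, beta, steps, rng):
    """Run a discrete-time SI process and return the set of infected nodes.

    At each step, every infected node infects each susceptible neighbor
    with probability beta. Infected nodes never recover.
    """
    infected = {seed_node}
    for _ in range(steps):
        newly_infected = set()
        for u in infected:
            for v in adjacency[u]:
                if v not in infected and rng.random() < beta:
                    newly_infected.add(v)
        infected |= newly_infected
    return infected


def infected_groups(adjacency, beta, steps, rng):
    """One infected group per starting node, as the abstract describes."""
    return {node: simulate_si(adjacency, node, beta, steps, rng) for node in adjacency}


# A small path graph 0 - 1 - 2 - 3 (made-up example).
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(infected_groups(graph, beta=0.5, steps=2, rng=random.Random(0)))
```

Each infected group could then play the role of a context set when training node vectors; how the authors build those training samples is not specified in the abstract.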

Citations: 7

Machine Learning Based Web-Traffic Analysis for Detection of Fraudulent Resource Consumption Attack in Cloud

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI: 10.1145/3350546.3352567

Rishabh Rustogi, Abhishek Agarwal, Ayush Prasad, S. Saurabh

Abstract: Attackers can orchestrate a fraudulent resource consumption (FRC) attack by wittingly consuming metered resources of the cloud servers for a long duration of time. The skillful over-consumption of the resources results in a significant financial burden to the client. These attacks differ in intent but not in content; hence, they are hard to detect. In this paper, we propose a novel scheme for the detection of the FRC attack on a cloud-based web server. We first divide the web pages into a number of quantiles based on their popularity index. Next, we compute the number of requests per hour for each of these quantiles. The Discrete Wavelet Transform is then applied to these quantiles to remove any high-frequency anomaly and smooth the time series data. The n-tuple data from these quantiles, along with their label (attack or normal), is used to train an Artificial Neural Network model. For a low-percentage FRC attack (5%), our trained model obtained an accuracy of 98.51%, with a precision of 0.983 and a recall of 0.987, in detecting the FRC attack.

CCS Concepts: • Security and privacy → Intrusion/anomaly detection and malware mitigation; • Computing methodologies → Supervised learning by classification.
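
The preprocessing steps in the abstract can be sketched as below. Several choices here are assumptions rather than details from the paper: pages are bucketed by hit-count rank, the wavelet is a one-level Haar transform, and the function names are hypothetical. The neural network training step is omitted.

```python
import math


def popularity_quantiles(page_hits, n_quantiles):
    """Map each page to a quantile index by hit-count rank (0 = most popular group)."""
    ranked = sorted(page_hits, key=lambda page: (-page_hits[page], page))
    size = math.ceil(len(ranked) / n_quantiles)
    return {page: idx // size for idx, page in enumerate(ranked)}


def haar_smooth(series):
    """Smooth a series with a one-level Haar DWT by zeroing the detail coefficients.

    The detail coefficients carry the high-frequency part of the signal, so
    reconstructing from the approximation alone replaces each pair by its mean.
    """
    if len(series) % 2 != 0:
        raise ValueError("series length must be even for one Haar level")
    s = math.sqrt(2.0)
    approx = [(series[i] + series[i + 1]) / s for i in range(0, len(series), 2)]
    smoothed = []
    for a in approx:
        # Inverse transform with detail = 0: both samples become a / sqrt(2).
        smoothed.extend([a / s, a / s])
    return smoothed


hits = {"/a": 100, "/b": 50, "/c": 10, "/d": 1}
print(popularity_quantiles(hits, 2))

hourly_requests = [10, 12, 50, 10]  # made-up requests per hour for one quantile
print(haar_smooth(hourly_requests))
```

The smoothed hourly counts of all quantiles at a given hour would then form one n-tuple of the kind the abstract describes as input to the classifier.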

Citations: 4