{"title":"A Novel Approach for Email Clustering Based on Semantics","authors":"Binlai He, Zefeng Li, Nan Yang","doi":"10.1109/WISA.2014.56","DOIUrl":"https://doi.org/10.1109/WISA.2014.56","url":null,"abstract":"An increasing interest has been recently devoted to clustering short documents. Short documents don't contain enough text to compute similarities accurately by implementing the most widely used technique called Vector Space Model (VSM). Adding semantics to short documents clustering is one efficient way to solve this problem. However, real life collections are often composed of very short or long documents. For example, the length of email messages for each email user follows a power-law distribution. Long emails and short emails both appear in email corpus. Therefore, both state-of-the-art short documents and long document clustering approaches can't get a high cluster quality or high efficiency in short and long documents clustering. In order to solve this problem, we propose a novel approach for email clustering based on semantics. Empirical validation shows that our method can obtain high cluster quality and high efficiency in real world email datasets.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124988988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Replication Placement Strategy Based On Bidding Mode for Cloud Storage Cluster","authors":"Hong Zhang, Bing Lin, Zhanghui Liu, Wenzhong Guo","doi":"10.1109/WISA.2014.45","DOIUrl":"https://doi.org/10.1109/WISA.2014.45","url":null,"abstract":"The data availability in large-scale cloud storage has been increasing by means of data replica. To provide cost-effective availability, minimize the response time of applications and make load balancing for cloud storage, a new replica placement policy with bidding thought is proposed. The policy combines the own characteristics of replica and factors of bidding mode(e.g. bidding time, bidding standard, bidding price etc.) and starts replica bidding activity when the file data availability cannot meet the given requirement. Replica placement is based on capacity and accessing probability of data nodes. The experimental results show that our policy has a better performance in both load balance and response time comparing to the static replica policy and CDRM scheme.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123528062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Collaborative Filtering Algorithm of Selecting Neighbors for Each Target Item","authors":"Yaqiong Guo, Mengxing Huang, Longfei Sun","doi":"10.1109/WISA.2014.33","DOIUrl":"https://doi.org/10.1109/WISA.2014.33","url":null,"abstract":"Traditional User-based collaborative filtering recommendation algorithm in the calculation of similarity between users only considers the users' score to the item, but not takes the difference of rated items into account. Aiming at the shortcomings of the traditional method, with the practical application of recommendation system, a new collaborative filtering recommendation algorithm is proposed which selects neighbors for each target item. Ratings based on item type determine preliminary neighbors from the users, for each target item computing neighbors of the target user, and in the case of not rating the target item, the expanded neighbors are considered, finally predicting and recommending target items. The experimental results show that the algorithm improves the accuracy of similarity calculation and the error performance when comparing with other classic algorithms, and effectively alleviates the user rating data sparsity problem, while improving the accuracy of the forecast.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123546741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Method for Link Prediction Using Various Features in Social Networks","authors":"Yu Zhang, Kening Gao, Feng Li, Ge Yu","doi":"10.1109/WISA.2014.34","DOIUrl":"https://doi.org/10.1109/WISA.2014.34","url":null,"abstract":"Link prediction is a basic problem in the research of social networks. At present, most link prediction algorithms are based on the features extracted from network structure, few research concerns the effect of natural attributes of nodes for creating a link. In this paper we develop a novel way to predict links based on Random Walk algorithm using the information from both the network topology and rich node attributes. The experiment result show that our method can help improves the prediction accuracy and it proves that node attributes have a real effect on link creation.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131298989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effective Social Circle Prediction Based on Bayesian Network","authors":"Yan Tang, Lili Lin, Zhuoming Xu, Yu Wang","doi":"10.1109/WISA.2014.32","DOIUrl":"https://doi.org/10.1109/WISA.2014.32","url":null,"abstract":"User's personal social networks are big and cluttered, yet contain highly valuable information. Organizing users' friends into circles or communities is a fundamental task in social network research. Social network sites allow users to manually categorize their friends into social circles, however this process is laborious and inadaptable to changes. In this paper, we study novel ways of automatically determining users' social circles. We treat this task as a classification problem on a user's ego-network, a network of connections between friends. Based on Bayesian Network (BN), we develop a model for determining whether a query user Uq is in main user Um's social circle. First, we transform the original social network data to make it suitable for BN modeling, and build an Initial Bayesian Network (IBN) of Um using the state-of-the-art BN learning algorithm. Then, we propose a new method to improve the IBN by adding important parents to the class variable. Lastly, leveraging carefully designed threshold, we use the final BN to determine the existence of Uq in the social circle of Um. Modeling social circle with BN allows us to quantify user's social circle existence with probability and run query with missing values/evidences. Using ground-truth data from Facebook and Twitter, experimental results indicate that our BN model could accurately determine user's existence in social circle and outperforms four baseline predictors, namely Naïve Bayes, IBL, OneR and J48, showing promising application potential in the social circle research area.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134377975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Media Fragment URI Aware User Agent","authors":"Ting Wu, Zhuoming Xu, Lixian Ni, Yuanhang Zhuang, Junhua Wang, Qin Yan","doi":"10.1109/WISA.2014.15","DOIUrl":"https://doi.org/10.1109/WISA.2014.15","url":null,"abstract":"The W3C's Media Fragments URI 1.0 specification provides for a media-format independent, standard means of addressing media fragments on the Web using Uniform Resource Identifiers (URIs). Thus, a key requirement is for the User Agent (UA) to efficiently retrieve media fragments identified by URIs from the regular media server over the HTTP protocol. This paper addresses the issue of how to construct a Media Fragment URI aware UA. We propose an approach for achieving such a UA, focusing on fully indexable container formats. Our approach consists of a set of algorithms capable of performing URI-based media fragment retrieval. Algorithm implementation and experimental results show that our approach is achievable and is able to greatly reduce time and bandwidth costs compared to the traditional approach of downloading the entire media resource.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"29 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122133606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Social Advertisability Analysis on Twitter","authors":"Ying Zhang, Xue Zhao, Chao Wang, Ya Wang, Lili Su, Xiaojie Yuan","doi":"10.1109/WISA.2014.30","DOIUrl":"https://doi.org/10.1109/WISA.2014.30","url":null,"abstract":"Twitter presents a nice opportunity for targeting advertisements that are contextually related to Twitter content. By virtue of the sparse and noisy text makes identifying the tweets for advertising a very hard problem. In this paper, we propose a novel and effective scheme to identify the tweets that can be targeted for advertisements. We firstly construct a multi-source corpus to collect more auxiliary information for advertisability analysis. We then build the LDA-based topic models to obtain the document-word distributions. We extract features according to these distributions and select contributing ones. Finally we train a logistic regression classifier to discriminate the advertisable tweets from unadvertisable ones. Extensive experiments on a representative real-word Twitter dataset demonstrate that our scheme can identify advertisable tweets effectively.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127176679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DZMQ: A Decentralized Distributed Messaging System for Realtime Web Applications and Services","authors":"Fei Yang, Xiaojun Ye, Yong Zhang, Chunxiao Xing","doi":"10.1109/WISA.2014.38","DOIUrl":"https://doi.org/10.1109/WISA.2014.38","url":null,"abstract":"Message-oriented middleware especially for message queue has been widely used in web applications and services. Performance and scalability are quite essential in these systems however they often become the bottleneck. Existing message queues are not able to scale out elastically very well. This paper presents a decentralized distributed architecture based on peer to peer model, in which we always deliver messages with zero or one hop and take advantage of zero-copy. We implemented a scaling algorithm that can be adapted to the dynamic scale of requests and make the system scale out elastically. A series of workload tests have proved that our system can have low response latency and achieve linear increasing throughput. With these desired properties, the message system can be used to develop large scale web applications and services and provide high-performance services to users.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130111891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Driver and Riders Matching Approach","authors":"Jamal Yousaf, Juan-Zi Li","doi":"10.1109/WISA.2014.18","DOIUrl":"https://doi.org/10.1109/WISA.2014.18","url":null,"abstract":"Dependence on personal automobiles is becoming increasingly costly due to accelerating climate change and rising gasoline prices. It is particularly wasteful when one realizes that most car seats are typically empty. With the advancement of mobile social networking technologies, it is necessary to reconsider the principles and desired characteristics of ride-sharing systems. Ride-sharing systems can be popular among people if we can provide more flexible and adaptive solution according to preferences of the participants and solve the social challenges. In this paper, we present the genetic algorithm for solving the riders and drivers matching problem with different conflicting objectives. The experiment results of the proposed algorithm indicates the superior performance over the generalized label correcting algorithm in terms of quality and runtime.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121593760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ontology Modeling for Mobile Software","authors":"Zhiyun Zhu, Yanhui Li, Baowen Xu","doi":"10.1109/WISA.2014.58","DOIUrl":"https://doi.org/10.1109/WISA.2014.58","url":null,"abstract":"With the rapid development of smart phones and mobile Internet, the number of mobile software has largely increased, and online mobile software markets have also emerged, providing mobile software introduction and download for users. However, for online resources are distributed and heterogeneous, different online markets may give different names, incomplete information or incorrect data about the same softwares, which causes troubles in management and maintenance of mobile software. In this paper, to solve inconsistency and heterogeneity of software information, we collect raw software data from popular online mobile software markets, unify its format by some pre-processing methods, utilize semantic technology to discover information errors and correct them automatically, and build mobile software ontology model to help manage the mobile phone software.","PeriodicalId":366169,"journal":{"name":"2014 11th Web Information System and Application Conference","volume":"45 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116902839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}