{"title":"NMFDIV: A Nonnegative Matrix Factorization Approach for Search Result Diversification on Attributed Networks","authors":"Zaiqiao Meng, Hong Shen","doi":"10.1109/PDCAT.2017.00023","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00023","url":null,"abstract":"Search result diversification is an effective way to tackle query ambiguity and enhance result novelty. In the context of large information networks, diversifying search results is also critical for the further design of applications such as link prediction and citation recommendation. In previous work, this problem has mainly been tackled through implicit query intent. To further enhance the performance, we propose an explicit search result diversification method that explicitly encodes query intent and represents nodes as representation vectors by a novel nonnegative matrix factorization approach, so that the diversity of the result nodes accounts for the query relevance and the novelty w.r.t. these vectors. To learn representation vectors for networks, we derive the multiplicative update rules to train the nonnegative matrix factorization model. Finally, we perform a comprehensive evaluation of our proposals against various baselines. Experimental results show the effectiveness of our proposed solution, and verify that attributes do help improve diversification performance.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114436530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Privacy-Preserving Cloud-Based Data Management System with Efficient Revocation Scheme","authors":"S. Chang, Ja-Ling Wu","doi":"10.1109/PDCAT.2017.00011","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00011","url":null,"abstract":"Many data management systems, for various reasons, delegate their heavy computational workloads to public cloud service providers. It is well known that once we entrust our tasks to a cloud server, we may face several threats, such as privacy infringement with regard to users' attribute information; therefore, an appropriate privacy-preserving mechanism is a must for constructing a secure cloud-based data management system (SCBDMS). Designing a reliable SCBDMS with server-enforced revocation ability is a very challenging task even if the server is working under the honest-but-curious mode. Existing data management systems seldom provide a privacy-preserving revocation service, especially when it is outsourced to a third party. In this work, with the aid of oblivious transfer and the newly proposed stateless lazy re-encryption (SLREN) mechanism, an SCBDMS with secure, reliable and efficient server-enforced attribute revocation ability is built. Compared with related works, our experimental results show that, in the newly constructed SCBDMS, the storage requirement of the cloud server and the communication overhead between the cloud server and system users are largely reduced, due to the late-involvement nature of SLREN.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115101448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Implementation of Dynamic Programming Problems Using Wavefront and Rank Convergence with Full Resource Utilization","authors":"Vivek Sourabh, Parth Pahariya, Isha Agarwal, Ankit Gautam, C. R. Chowdary","doi":"10.1109/PDCAT.2017.00033","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00033","url":null,"abstract":"In this paper, we propose a novel approach which uses full processor utilization to compute a particular class of dynamic programming problems in parallel. This class includes algorithms such as Longest Common Subsequence and Needleman-Wunsch. In dynamic programming, a larger problem is divided into smaller problems which are then solved, and the results are used to compute the final result. Each subproblem can be considered as a stage. If computations made in a stage are independent of the computations made in other stages, then these stages can be calculated in parallel. The idling of processors bottlenecks the performance of existing parallel algorithms. In this paper, we use rank convergence for the computation of each stage, ensuring full processor utilization. This increases the efficiency and speedup of the parallel algorithm.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117052451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Strike the Balance between System Utilization and Data Locality under Deadline Constraint for MapReduce Clusters","authors":"Yeh-Cheng Chen, J. Chou","doi":"10.1109/PDCAT.2017.00061","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00061","url":null,"abstract":"The MapReduce paradigm has become a popular platform for massive data processing and Big Data applications. Although MapReduce was initially designed for high throughput and batch processing, it has also been used for handling many other types of applications and workloads due to its scalable and reliable system architecture. One of the emerging requirements for enterprise data-process computing is a completion time guarantee. However, only a few research works have addressed MapReduce jobs with deadline constraints. Therefore, in this paper, we aim to prevent jobs from missing their deadlines while maximizing the resource utilization and data locality of a MapReduce cluster. Our approach is to introduce a two-phase job scheduling mechanism which combines a job admission controller policy and a priority-based scheduling algorithm. We use a series of simulations over diverted workload to evaluate our system. The results show that our approach can guarantee job completion time in a heavily loaded system, and achieve comparable data locality to the delay schedule algorithm in a lightly loaded system. Furthermore, our approach can maximize system throughput by preventing system resources from being wasted by the jobs missing their deadlines.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116347472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Handling Churn in Similarity Based Clustering Overlays Using Weighted Benefit","authors":"I. Bukhari, A. Harwood, S. Karunasekera","doi":"10.1109/PDCAT.2017.00069","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00069","url":null,"abstract":"Similarity based clustering (SBC) overlays are decentralized networks of nodes on the Internet edge, where each node maintains some number of direct connections to other nodes that are most \"similar\" to it. The challenge is how the nodes in the overlay converge to and maintain the most similar neighbors, given that the network is decentralized, is subject to churn, and that similarity varies over time. Protocols that simultaneously provide fast convergence and low bandwidth consumption are the objective of this research. We present a protocol, which we call the Weighted Benefit Scheme (WBS), that improves upon the existing state of the art in this area: it has a convergence rate equivalent to the Optimum Benefit Protocol while simultaneously handling churn competitively with the Vicinity protocol. We use real-world datasets from Yahoo WebScope that comprise 15,400 users with 354,000 ratings of 1000 songs, and our experiments are performed on the simulation test-bed PeerNet.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128408142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Implementation of Local Similarity Search for Unstructured Text Using Prefix Filtering","authors":"Manu Agrawal, Kartik Manchanda, Ribhav Soni, A. Lal, C. R. Chowdary","doi":"10.1109/PDCAT.2017.00025","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00025","url":null,"abstract":"Identifying partially duplicated text segments among documents is an important research problem with applications in plagiarism detection and near-duplicate web page detection. We investigate the problem of local similarity search for finding partially replicated text, focusing on its parallel implementation. Our aim is to find text windows that are approximately similar in two documents, using a filter verification framework. We present various parallel approaches to the problem, of which input data partitioning along with the reduction of individual index maps was found to be most suitable. We analyzed the effect of varying similarity threshold and number of processes on speedup, and also performed cost analysis. Experimental results show that the proposed method achieves up to 13x speedup on a 24-core processor.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129147295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SMiPE: Estimating the Progress of Recurring Iterative Distributed Dataflows","authors":"Jannis Koch, L. Thamsen, Florian Schmidt, O. Kao","doi":"10.1109/PDCAT.2017.00034","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00034","url":null,"abstract":"Distributed dataflow systems such as Apache Spark allow the execution of iterative programs at large scale on clusters. In production use, programs are often recurring and have strict latency requirements. Yet, choosing appropriate resource allocations is difficult as runtimes are dependent on hard-to-predict factors, including failures, cluster utilization and dataset characteristics. Offline runtime prediction helps to estimate resource requirements, but cannot take into account inherent variance due to, for example, changing cluster states. We present SMiPE, a system estimating the progress of iterative dataflows by matching a running job to previous executions based on similarity, capturing properties such as convergence, hardware utilization and runtime. SMiPE is not limited to a specific framework due to its black-box approach and is able to adapt to changing cluster states reflected in the current job’s statistics. SMiPE automatically adapts its similarity matching to algorithm-specific profiles by training parameters on the job history. We evaluated SMiPE with three iterative Spark jobs and nine datasets. The results show that SMiPE is effective in choosing useful historic runs and predicts runtimes with a mean relative error of 9.1% to 13.1%.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130800643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey of User Preferences Oriented Service Selection and Deployment in Multi-Cloud Environment","authors":"Letian Yang, Li Liu, Qi Fan","doi":"10.1109/PDCAT.2017.00065","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00065","url":null,"abstract":"Service selection based on user preferences and service deployment are challenging due to the diversity of user demands and preferences in the multi-cloud environment. Few works have clearly reviewed the existing work on user-preference-oriented service selection and service deployment in the multi-cloud environment. In this paper, we propose and motivate taxonomies for user-preference-oriented service selection and deployment. We present a detailed survey of the state of the art in terms of the description and analysis of user preferences, optimization objectives and constraints. Finally, we analyze the existing works and discuss future work in this area of multi-cloud service selection and deployment based on user preferences.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116614257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Computing of Optimized Clustering Threshold Values Based on Quasi-Classes Space for the Merchandise Recommendation","authors":"Mingshan Xie, Yanfang Deng, Yong Bai, Mengxing Huang, Wenbo Jiang, Zhuhua Hu","doi":"10.1109/PDCAT.2017.00043","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00043","url":null,"abstract":"Merchandise recommendation is an important part of electronic commerce. In view of the difficulty in obtaining users' private information and modeling user interest, this paper bases commodity recommendation on the relationships between goods. We use fuzzy clustering learning to construct a quasi-classes space. Through the intersection of a quasi-class and the collection of goods being ordered by users, we can learn the customers' appetites for merchandise, and then recommend goods. In the construction of the quasi-classes space, the value of the threshold Λ must be appropriate, because the threshold Λ determines the size of the quasi-class. A quasi-class that is too large or too small will affect the recommendation of goods. The influence of the threshold Λ on commodity recommendation is discussed through a numerical example, and we finally find the best value of Λ in this paper.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121686001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computation Capability Deduction Architecture for MapReduce on Cloud Computing","authors":"Tzu-Chi Huang, Kuo-Chih Chu, Guo-Hao Huang, Yan-Chen Shen, C. Shieh","doi":"10.1109/PDCAT.2017.00067","DOIUrl":"https://doi.org/10.1109/PDCAT.2017.00067","url":null,"abstract":"MapReduce has gradually become the de facto programming standard for applications on cloud computing. However, MapReduce needs a cloud administrator to manually configure parameters of the run-time system, such as slot numbers for Map and Reduce tasks, in order to get the best performance. Because manual configuration carries a risk of performance degradation, MapReduce should utilize the Computation Capability Deduction Architecture (CCDA) proposed in this paper to avoid the risk. MapReduce can use CCDA to help the run-time system distribute appropriate numbers of tasks over computers in a cloud at run time without any manual configuration made by a cloud administrator. According to the experimental observations in this paper, MapReduce can achieve a great performance improvement with the help of CCDA in data-intensive applications such as Inverted Index and Word Count that are usually required to process big data on cloud computing.","PeriodicalId":119197,"journal":{"name":"2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121709673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}