{"title":"A unified scaling model in the era of big data analytics","authors":"Zhongwei Li, Feng Duan, Hao Che","doi":"10.1145/3318265.3318268","DOIUrl":"https://doi.org/10.1145/3318265.3318268","url":null,"abstract":"As scale-out execution of big data analytics has become predominate datacenter workloads, it is of paramount importance to faithfully characterize the scaling properties for such workloads. To date, the most widely cited scaling laws for big data analytics is the traditional Amdahl's law, which was discovered well before the era of big data analytics. A key observation made in this paper is that both the system and workload models underlying the traditional scaling laws are too simplistic to fully characterize the scaling properties for big data analytics workloads. In this paper, we put forward a Unified Scaling model for Big data Analytics (USBA), based on a multi-stage system model and a discretized workload model. USBA allows for flexible workload scaling unifying the fixed-size and fixed-time workload models underlying Amdahl's and Gustafson's laws, respectively, and flexible system scaling in terms of both number of stages and degree of parallelism per stage. Moreover, to faithfully characterize the scaling properties for big data analytics workloads, USBA accounts for variabilities of task response times and barrier synchronization. Finally, application of USBA to the scaling analysis of four Spark-based data mining and graph benchmarks demonstrates that USBA is able to adequately characterize the scaling design space and predict the scaling properties of real-world big data analytics workloads. This makes it possible to use USBA as a useful tool to facilitate job resource provisioning for big data analytics in datacenters.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123131614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance analysis of co-operative MIMO channel over sensor control networks","authors":"Summera Shamrooz, Qianmu Li","doi":"10.1145/3318265.3318286","DOIUrl":"https://doi.org/10.1145/3318265.3318286","url":null,"abstract":"In cooperative networks, to reduce the interference communication among the coordination of cells a relays setup introduced between multiple inputs multiple output (MIMO) designs. In this paper, Cooperative MIMO (C-MIMO) is utilized for efficient energy technologies. For sake of best performance of Wireless the sensor Networks (WSNs), the bit inter-leaved coded modulation (BICM) and bit inter-leaved coded modulation with iterative decoding (BICM-ID) code are utilized. The simulation results demonstrate that for BICM and BICM-ID the cooperative communication scheme exceeds the single input single output (SISO) technique. Under the different statistical distributions over the WSNs the bit error rate (BER) is analyzed for the system.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121147204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OPS: an optimized partial stripe write scheme to improve performance of XOR-based disk arrays tolerating triple disk failures","authors":"Xunsong Huang, Chentao Wu, Jie Li","doi":"10.1145/3318265.3318274","DOIUrl":"https://doi.org/10.1145/3318265.3318274","url":null,"abstract":"In cloud storage and big data processing systems, RAID especially disk arrays tolerating triple disk failures (3DFTs) is a popular choice to provide high reliability with low monetary cost. For 3DFTs, a key obstacle is the low partial stripe write performance, which is caused by large amount of parity modifications based on complex erasure coding layouts. In order to solve this problem, in this paper, we propose an optimized partial stripe write (OPS) method, which reorganizes the distribution of write data blocks to share partial parities among data blocks, thereby improving overall I/O performance. The OPS method can effectively reduce the number of modified parities. To illustrate the effectiveness of our OPS method, we used Disksim to evaluate several different partial stripe write methods through simulation. The results show that OPS can reduce the average response time by up to 37.21% and decrease the number of write operations by up to 26.22% compared to the traditional partial stripe write method.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114797181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A radio environment map construction scheme with hidden Markov Model based spectrum occupancy prediction","authors":"Liwei Huang, W. Shao, Yan Zhang, Jun-Jie Yang, Yaxiang Liu","doi":"10.1145/3318265.3318284","DOIUrl":"https://doi.org/10.1145/3318265.3318284","url":null,"abstract":"A radio environment map (REM) construction scheme with hidden Markov model (HMM) based spectrum occupancy prediction is presented in this paper. The predicted spectrum occupancy state can be shown visually in the REM, which is used to find the opportunity to access a certain channel. Firstly, the HMM is studied to predict the spectrum occupancy state at the future time in a unsupervised manner. Secondly, according to the predicted result from the HMM, we construct the REM by the extrapolation method based on wireless propagation loss models to reflect the potential spectrum occupancy status and make a decision on accessing the channel or not. Simulation results show that the proposed scheme has a good prediction performance and the constructed REM is helpful for the secondary user to access the desired channel.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130204195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two-stage population based training method for deep reinforcement learning","authors":"Yinda Zhou, W. Liu, Bin Li","doi":"10.1145/3318265.3318294","DOIUrl":"https://doi.org/10.1145/3318265.3318294","url":null,"abstract":"Deep reinforcement learning (DRL) methods has been widely applied on more and more challenging learning tasks, and achieved excellent performance. However, the efficiency of deep reinforcement learning is notoriously sensitive to their own hyperparameter configuration. The optimization process of deep reinforcement learning is highly dynamic and non-stationary, rather than a simple fitting process. So, its optimal hyperparameter should be adaptively adjusted according to the current learning process, rather than using a fixed set of hyperparameter configurations from beginning to end. DeepMind innovatively proposed a population based training (PBT) method for deep reinforcement learning, which achieved hyperparameter adaptation and made the model better trained. However, we assume that at the early stage when the learning model has little knowledge of the environment, frequent hyperparameter change will not be helpful for the model to learn efficiently, while learning with a reasonable fixed hyperparameter configuration will help the model obtain necessary knowledge as quick as possible, which we consider is more important for reinforcement learning at early stage. In this paper, we verified our hypothesis through experiments, and a Two-Stage Population Based Training (TS-PBT) method is proposed, which is a more efficient population based training method for deep reinforcement learning. Experiments show that at the same computational budget, our TS-PBT method makes the final performance of the model significantly better than the PBT method. TS-PBT achieved 40%, 310%, 2%, 53%, 30% and 38% performance improvement over PBT separately in six test environments.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"183 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121950329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A parallel clustering algorithm for logs data based on Hadoop platform","authors":"J. Huo, Jia-Yow Weng, Hong Qu","doi":"10.1145/3318265.3318281","DOIUrl":"https://doi.org/10.1145/3318265.3318281","url":null,"abstract":"Log analysis is an important method to reflect the running status and user behavior of the network system, and is also an important way to ensure network security. In view of the fact that the storage or calculation of log data by a single host can not meet the requirements of large-scale data analysis, this paper proposes a clustering method of big data based on Map/Reduce distributed computing framework for Web logs. The experiments are taken on the Hadoop platform. The relations and rules that exist in the logs are examined and analyzed to obtain the potential information. This method can enable efficient storage, management, and mining analysis for the large-scale Web logs.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122610355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic essay scoring with recurrent neural network","authors":"Changzhi Cai","doi":"10.1145/3318265.3318296","DOIUrl":"https://doi.org/10.1145/3318265.3318296","url":null,"abstract":"As deep learning has developed rapidly in recent years, the automatic essay scoring system, based on deep learning models, has become more reliable than previous feature-based systems. Recent researchers have developed an approach based on recurrent neural networks to learn the relationship between an essay and its assigned score, without any feature engineering. In this paper, we use an ASAP essay dataset, combining feature scoring and a recurrent neural network. The results show that we can compare the result of quadratic weighted Kappa of each experience to get the best model. GloVe significantly improves the results, and feature extraction can affect the result slightly. In future work, we will apply transfer learning, one-shot learning, and adversarial inputs in our model to get better performance.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131910914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data fusion algorithms for wireless sensor networks based on deep learning model","authors":"Lihong Wang, Kuiliang Xia","doi":"10.1145/3318265.3318297","DOIUrl":"https://doi.org/10.1145/3318265.3318297","url":null,"abstract":"In order to reduce the energy consumption and prolong the lifetime of wireless sensor networks (WSN), a data fusion algorithm based on deep learning model is proposed. Firstly, the algorithm completes training and clustering at the sink node, transfers the trained parameters to each cluster node, and then transfers the collected data to the sink node after feature classification, extraction and fusion. In order to make the distribution of cluster heads more uniform, the clustering method is improved on the basis of estimating the optimal number of cluster heads, which reduces the number of clusters and saves the energy consumption of the network. The simulation results show that the WSN data fusion algorithm based on deep learning model reduces the network energy consumption, prolongs the network lifetime, and is more suitable for large-scale telecommunication.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134368311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A road network matching method based on particle swarm optimization","authors":"F. Zhu, Peng-Zhong Wang","doi":"10.1145/3318265.3318282","DOIUrl":"https://doi.org/10.1145/3318265.3318282","url":null,"abstract":"With the complexity of spatial object matching between multi-source and multi-scale road networks is increasing, road network space target matching method encountered different levels of bottleneck in precision and accuracy. This paper proposes a road network matching method based on the stable spatial hierarchical structure, this method has both global and local features, it can overcome the mismatch caused by excessive dependence on local morphological structure as similarity criterion, and the matching result can also be found by fast convergence. The experimental results show that this paper combines particle swarm optimization for road network matching, it has obvious advantages in regions with similar local structures and significant global structural differences, the matching accuracy and optimization efficiency are improved obviously.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122677645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Siamese bayesian networks for AI based differential diagnosis","authors":"Monish Kaul, Nikhil S. Narayan, A. Narayanan","doi":"10.1145/3318265.3318298","DOIUrl":"https://doi.org/10.1145/3318265.3318298","url":null,"abstract":"Differential diagnosis refers to the process of differentiating between two or more conditions which share similar signs or symptoms. Classical methods such as Bayesian Networks proposed in the past to automatically obtain a differential diagnosis do not consider negative evidence for prediction and also lack the ability to model hidden influences on diseases. In order to address the shortcomings of the existing methods for automated differential diagnosis, we propose a novel Siamese Bayesian Networks that takes into consideration the absence of a symptom as a strong negative evidence to converge to the actual diagnosis. We show that the proposed algorithm has a 40% improvement over manual differential diagnosis of disorders and a 10% improvement over classical Bayesian Networks approach for differential diagnosis.","PeriodicalId":241692,"journal":{"name":"Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126280162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}