{"title":"Self-optimization of power parameters in WCDMA networks","authors":"Harrison Mfula, T. Isotalo, J. Nurminen","doi":"10.1109/HPCSim.2015.7237024","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237024","url":null,"abstract":"Network optimization is used by operators to maximize return on investment and to ensure customer satisfaction with the quality of the delivered service. Coverage and capacity are the most important characteristics of any cellular network. In WCDMA networks, the pilot signal of a cell is used to determine the cell size, hence it can be used to determine the coverage area of the cell. Increasing or reducing the cell pilot power increases or reduces the cell size respectively and hence pilot power can be used to balance load among neighboring cells. As networks continue to evolve, the frequency of optimization and number of tunable parameters continues to increase making manual optimization challenging. This paper presents a practical solution to the pilot power optimization problem in WCDMA networks and addresses the issue of rising optimization complexity by presenting a self-optimization based algorithm for tuning pilot power. When running in closed loop, the algorithm can be used to autonomously optimize pilot power and load balance traffic in the network. 
When scheduled or triggered manually, the algorithm can also be used to improve network capacity in areas expecting high traffic load during a certain time for example during social gatherings.","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130999645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPGPU performance evaluation of some basic molecular dynamics algorithms","authors":"A. Minkin, A. Teslyuk, A. Knizhnik, B. Potapkin","doi":"10.1109/HPCSim.2015.7237104","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237104","url":null,"abstract":"Molecular dynamics is a computationally intensive problem but it is extremely amenable for parallel computation. Many-body potentials used for modeling of carbon and metallic nanostructures usually require much more computing resources than pair potentials. One of the ways to improve their performance is to transform them for running on computing systems that combines CPU and GPU. In this work OpenCL performance of basic molecular dynamics algorithms such as neighbor list generation along with different implementations of energy-force computation of Lennard-Jones, Tersoff and EAM potentials is evaluated. It is shown that concurrent memory writes are effective for Tersoff bond order potential and are not good for embedded-atom potential. Performance measurements show a significant GPU acceleration of basic molecular dynamics algorithms over the corresponding serial implementations.","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115496994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transient performance evaluation of cloud computing applications and dynamic resource control in large-scale distributed systems","authors":"Edwin L. C. Mamani, L. A. P. Júnior, M. J. Santana, R. Santana, Pedro Northon Nobile, F. J. Monaco","doi":"10.1109/HPCSim.2015.7237046","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237046","url":null,"abstract":"This paper discusses on non-stationary performance evaluation and dynamic modeling of cloud computing environments. In computer systems, dynamic effects results from the filling of buffers, event-handling delays, non-deterministic I/O response times, network latency, among other factors. While computer systems performance evaluation under stationary workloads have met the needs of many engineering problems, new challenges arise as the deployment of increasingly complex and large-scale distributed systems becomes commonplace. One key aspect of this discussion is that transient analysis models how the system reacts to changes in the workload and may reveal that the resources necessary to support a high steady-state workload may not be sufficient to handle a small, but sudden, workload change, even of intensity far smaller than that supported by the system's stationary capacity. This article elaborates on these issues under a control-theoretical approach.","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114211872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient storage scheme for n-dimensional sparse array: GCRS/GCCS","authors":"Md Abu Hanif Shaikh, K. Hasan","doi":"10.1109/HPCSim.2015.7237032","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237032","url":null,"abstract":"Degree of data sparsity increases with the increase of number of dimensions in high performance scientific computing. Storing and applying operations on this highly sparse multidimensional data is still a challenge for data scientists. Experts suggest special storage scheme over sparse array. In traditional sparse array storage scheme, (n+l) one dimensional arrays are necessary to store n-dimensional array. In this paper, we propose `Generalized Row/Column Storage (GCRS/GCCS)' scheme which requires three one dimensional arrays only for storing a n-dimensional array. The superiority of the GCRS/GCCS over traditional Compressed Row/Column Storage (CRS/CCS) is shown by both theoretical analysis and experimental results. In theoretical analysis, we derive equations for space and time complexity as well as the range of usability for GCRS/GCCS. It is shown that the GCRS/GCCS scheme yields to support minimum 50% data density where as the range of usability is inversely proportional with the number of dimensions for CRS/CCS scheme. 
The experimental result shows that the proposed GCRS/GCCS scheme outperforms the CRS/CCS scheme with respect to space complexity, time complexity and range of usability.","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117179962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey on Information Flow Control mechanisms in web applications","authors":"Oscar Zibordi de Paiva, W. Ruggiero","doi":"10.1109/HPCSim.2015.7237042","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237042","url":null,"abstract":"Web applications are nowadays ubiquitous channels that provide access to valuable information. However, web application security remains problematic, with Information Leakage, Cross-Site Scripting and SQL-Injection vulnerabilities - which all present threats to information - standing among the most common ones. On the other hand, Information Flow Control is a mature and well-studied area, providing techniques to ensure the confidentiality and integrity of information. Thus, numerous works were made proposing the use of these techniques to improve web application security. This paper provides a survey on some of these works that propose server-side only mechanisms, which operate in association with standard browsers. It also provides a brief overview of the information flow control techniques themselves. At the end, we draw a comparative scenario between the surveyed works, highlighting the environments for which they were designed and the security guarantees they provide, also suggesting directions in which they may evolve.","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131467341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance evaluation of Data Mining algorithms on three generations of Intel® microarchitecture","authors":"S. Sadasivam, S. Selvi","doi":"10.1109/HPCSim.2015.7237059","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237059","url":null,"abstract":"Data Mining algorithms and machine learning techniques form a key part of the majority of computing applications today. They are becoming an inherent part of business decision processes, e-commerce, social networking and social media applications as well as commercial and scientific computing applications. It is becoming increasingly important to provide a high performance computing platform for these emerging data mining applications. In this paper we explore the performance characteristics of the data mining benchmark suite MineBench across three “tock” generations of Intel microarchitecture. Our objective is to study the impact of microarchitecture improvements on the performance of data mining algorithms. We present comparative microarchitecture characteristics between data mining algorithms and SPEC INT 2006 benchmarks. We have proposed a generic cycle accounting methodology to attribute performance improvements to various units of the microprocessor. The proposed methodology helps differentiate the impact on performance due to front-end and back-end microarchitecture improvements.","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127584179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilization of room-to-room transition time in Wi-Fi fingerprint-based indoor localization","authors":"Isil Karabey, Levent Bayindir","doi":"10.1109/HPCSim.2015.7237056","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237056","url":null,"abstract":"In indoor localization applications, many different methods have been proposed to increase positioning accuracy. Among these methods, fingerprint-based techniques are generally preferred because they use existing resources such as Wi-Fi, Bluetooth, FM signals, etc., and can be implemented on commonly used devices such as mobile phones. In this paper, we evaluate different Wi-Fi fingerprint-based methods on two datasets (with and without room-to-room transition features) created from the same environment, and we investigate the impact of room-to-room transition features on classification performance. To the best of our knowledge, transition time between rooms has not been used in past studies on fingerprint-based indoor localization. This information is of significant importance, due to the physical distance between rooms. Therefore, in this study source room and transition time to a target room have been included as features in addition to signal sources and signal strength values in the target room. 
From preliminary experimental results we observed that the transition time between rooms increases the performance of all tested positioning algorithms, with the Back-propagation classifier showing the best performance increase (13%).","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129819335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying domain decomposition Schwarz method to accelerate wind field calculation","authors":"Gemma Sanjuan, T. Margalef, A. Cortés","doi":"10.1109/HPCSim.2015.7237080","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237080","url":null,"abstract":"Wind field is a critical issue in forest fire propagation prediction. However, wind field calculation is a complex problem that for large terrains involves solving huge linear systems. Solving such systems takes too much time and makes the approach unfeasible in real time operation. To overcome this problem the Schwarz alternating domain decomposition can be applied. Using this method the linear system is decomposed in a set of overlapped subdomains that can be solved in parallel using a Master/Worker paradigm and the wind field calculation time can be significantly reduced.","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120961268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast and scalable NUMA-based thread parallel breadth-first search","authors":"Yuichiro Yasui, K. Fujisawa","doi":"10.1109/HPCSim.2015.7237065","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237065","url":null,"abstract":"The breadth-first search (BFS) is one of the most centric kernels in graph processing. Beamer's direction-optimizing BFS algorithm, which selects one of two traversal directions at each level, can reduce unnecessary edge traversals. In a previous paper, we presented an efficient BFS for a non-uniform memory access (NUMA)-based system, in which the NUMA architecture was carefully considered. In this paper, we investigate the locality of memory accesses in terms of the communication with remote memories in a BFS for a NUMA system, and describe a fast and highly scalable implementation. Our new implementation achieves performance rates of 174.704 billion edges per second for a Kronecker graph with 233 vertices and 237 edges on two racks of a SGI UV 2000 system with 1,280 threads. The implementations described in this paper achieved the fastest entries for a shared-memory system in the June 2014 and November 2014 Graph500 lists, and produced the most energy-efficient entries in the second, third, and fourth Green Graph500 lists (big data category).","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127850134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Batched DOACROSS loop parallelization algorithm","authors":"D. C. S. Lucas, G. Araújo","doi":"10.1109/HPCSim.2015.7237079","DOIUrl":"https://doi.org/10.1109/HPCSim.2015.7237079","url":null,"abstract":"Parallelizing loops containing loop-carried dependencies has been considered a very difficult task, mainly due to the overhead imposed by communicating dependencies between iterations. Despite the huge effort to devise effective parallelization techniques for such loops, the problem is still far from solved. For many loops, old (DOACROSS), and new (DSWP) techniques have not been able to offer a solution to this problem. This paper does a qualitative and quantitative analysis of synchronization costs of these two loop parallelization algorithms, on two modern computer architectures (ARM A9 MPCore and Intel Ivy Bridge). Our results show that at least 30% of the execution time of the programs we parallelized are spent on synchronization/data communication. We also show that, besides the problem being hard, these architectures are on opposite endpoints along the axis of commonly accepted requisites for efficient loop parallelization. As a consequence, both techniques struggle to effectively speed up several programs. Moreover, this paper presents a novel algorithm, called Batched DOACROSS (BDX), that capitalizes on the advantages of DSWP and DOACROSS, while minimizing their deficiencies. 
BDX does not require new hardware mechanisms (as DSWP does) and makes use of thread local buffers to reduce DOACROSS synchronization overheads.","PeriodicalId":134009,"journal":{"name":"2015 International Conference on High Performance Computing & Simulation (HPCS)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121762949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}