{"title":"Performance modeling and prediction for scientific Java applications","authors":"Rui Zhang, Zoran Budimlic, K. Kennedy","doi":"10.1109/ISPASS.2006.1620804","DOIUrl":null,"url":null,"abstract":"With the expansion of the Internet, the grid has become an attractive platform for scientific computing. Java, with a platform-independent execution model and built-in support for distributed computing is an inviting choice for implementation of applications intended for grid execution. Recent work has shown that an accurate performance model combined with a load-balancing scheduling strategy can significantly improve the performance of distributed applications on a heterogeneous computing platform, such as the grid. However, current performance modeling techniques are not suitable for Java applications, as the virtual machine execution model presents several difficulties: 1) a significant amount of time is spent on compilation at the beginning of the execution, 2) the virtual machine continuously profiles and recompiles the code during the execution, 3) garbage collection can have unpredictable effects on memory hierarchy, 4) some applications can spend more time garbage collecting than computing for certain heap sizes and 5) small variations in virtual machine implementation can have a large impact on the application's behavior. In this paper, we present a practical profile-based strategy for performance modeling of Java scientific applications intended for execution on the grid. We introduce two novel concepts for the Java execution model: point of predictability (PoP) and point of unpredictability (PoU). PoP accounts for the volatile nature of the effects of the virtual machine on execution time for small problem sizes. PoU accounts for the effects of garbage collection on certain applications that have a memory footprint that approaches the total heap size. We present an algorithm for determining PoP and PoU for Java applications, given the hardware platform, virtual machine and heap size. We also present a code-instrumentation-based mechanism for building the algorithm complexity model for a given application. We introduce a technique for calibrating this model that is able to accurately predict the execution time of Java programs for problem sizes between PoP and PoU. Our preliminary experiments show that techniques can achieve load balancing with more than 90% average CPU utilization.","PeriodicalId":369192,"journal":{"name":"2006 IEEE International Symposium on Performance Analysis of Systems and Software","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE International Symposium on Performance Analysis of Systems and Software","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2006.1620804","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
With the expansion of the Internet, the grid has become an attractive platform for scientific computing. Java, with a platform-independent execution model and built-in support for distributed computing is an inviting choice for implementation of applications intended for grid execution. Recent work has shown that an accurate performance model combined with a load-balancing scheduling strategy can significantly improve the performance of distributed applications on a heterogeneous computing platform, such as the grid. However, current performance modeling techniques are not suitable for Java applications, as the virtual machine execution model presents several difficulties: 1) a significant amount of time is spent on compilation at the beginning of the execution, 2) the virtual machine continuously profiles and recompiles the code during the execution, 3) garbage collection can have unpredictable effects on memory hierarchy, 4) some applications can spend more time garbage collecting than computing for certain heap sizes and 5) small variations in virtual machine implementation can have a large impact on the application's behavior. In this paper, we present a practical profile-based strategy for performance modeling of Java scientific applications intended for execution on the grid. We introduce two novel concepts for the Java execution model: point of predictability (PoP) and point of unpredictability (PoU). PoP accounts for the volatile nature of the effects of the virtual machine on execution time for small problem sizes. PoU accounts for the effects of garbage collection on certain applications that have a memory footprint that approaches the total heap size. We present an algorithm for determining PoP and PoU for Java applications, given the hardware platform, virtual machine and heap size. We also present a code-instrumentation-based mechanism for building the algorithm complexity model for a given application. We introduce a technique for calibrating this model that is able to accurately predict the execution time of Java programs for problem sizes between PoP and PoU. Our preliminary experiments show that techniques can achieve load balancing with more than 90% average CPU utilization.