JVM Configuration Management and Its Performance Impact for Big Data Applications

2016 IEEE International Congress on Big Data (BigData Congress). Pub Date: 2016-06-01. DOI: 10.1109/BigDataCongress.2016.64

S. Sahin, Wenqi Cao, Qi Zhang, Ling Liu

Citations: 11

Abstract

Big data applications are typically written in garbage-collected languages such as Java to benefit from automatic memory management and to avoid the pitfalls of explicit, manual memory management, e.g., dangling pointers, memory leaks, and dead objects. However, application performance in Java-like garbage-collected languages is known to be highly correlated with the heap size and with the performance of the language runtime, such as the Java Virtual Machine (JVM). Although different heap resizing techniques and garbage collection algorithms have been proposed, most existing solutions require modifications to the JVM, guest OS kernel, host OS kernel, or hypervisor. In this paper, we evaluate and analyze the effects of tuning JVM heap structure and garbage collection (GC) parameters on application performance, without requiring any modification to the JVM, guest OS, host OS, or hypervisor. Our extensive measurement study yields a number of interesting observations: (i) increasing the heap size may not improve application performance in all cases and at all times; (ii) a heap space error does not necessarily indicate that the heap is full; (iii) heap space errors can be resolved by tuning heap structure parameters without enlarging the heap; and (iv) a JVM with a small heap may achieve the same application performance through tuning of the JVM heap structure and GC parameters, without any modification to the JVM, VM, or OS kernel. We conjecture that these results can help developers of big data applications achieve high-performance big data computing through better management and configuration of their JVM runtime.
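
To make observations (ii) and (iii) more concrete, the following is a minimal sketch and not the measurement setup used in the paper. The abstract does not name specific parameters, so the sketch assumes that "heap structure parameters" can be illustrated with HotSpot's standard `-XX:NewRatio` and `-XX:SurvivorRatio` options. The class name `HeapLayoutSketch` and its helper methods are hypothetical. The helpers compute the nominal generation sizes implied by those ratios (the real JVM aligns and may adapt these values), and `main` prints per-pool heap usage through the standard `java.lang.management` API, which shows that a single heap is split into regions that fill up independently.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Illustrative sketch only; names and structure are assumptions, not taken from the paper.
final class HeapLayoutSketch {

    // Nominal young-generation size when old:young = newRatio:1 (HotSpot -XX:NewRatio).
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    // Nominal size of ONE survivor space when eden:survivor = survivorRatio:1
    // (HotSpot -XX:SurvivorRatio); the young generation holds eden plus two survivors.
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    // Eden is whatever remains of the young generation after the two survivor spaces.
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        // Report each heap memory pool separately: one pool can be nearly full
        // while the heap as a whole still has free space.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined for this pool.
            System.out.printf("%-28s used=%,d committed=%,d max=%,d%n",
                    pool.getName(), usage.getUsed(), usage.getCommitted(), usage.getMax());
        }

        // Nominal layout for the heap limit this JVM was started with,
        // using example ratios (2 and 8) chosen purely for illustration.
        long heap = Runtime.getRuntime().maxMemory();
        long young = youngBytes(heap, 2);
        System.out.printf("nominal young=%,d eden=%,d survivor=%,d (each)%n",
                young, edenBytes(young, 8), survivorBytes(young, 8));
    }
}
```

One way to try it, assuming JDK 11 or later for single-file launch, is `java -Xmx1g -XX:NewRatio=2 -XX:SurvivorRatio=8 HeapLayoutSketch.java`, then rerun with different ratios and compare the per-pool maxima. Pool names and the exact sizes depend on the collector and JVM version in use.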
