Bytecode fetch optimization for a Java interpreter

ASPLOS X, Pub Date: 2002-10-01, DOI: 10.1145/605397.605404
Kazunori Ogata, H. Komatsu, T. Nakatani
Citations: 10

Abstract

Interpreters play an important role in many languages, and their performance is particularly critical for Java. Interpreter performance matters even for high-performance virtual machines that employ just-in-time compiler technology, because there are advantages in delaying the start of compilation and in reducing the number of methods to be compiled. Many techniques have been proposed to improve the performance of various interpreters, but none of them has fully addressed two issues inherent to interpreters running on superscalar processors: redundant memory accesses and the overhead of indirect branches. These issues are especially serious for Java because each bytecode is typically one or a few bytes long, and the execution routine for each bytecode is also short due to the low-level, stack-based semantics of Java bytecode. In this paper, we describe three novel techniques used in our Java bytecode interpreter: write-through top-of-stack caching (WT), position-based handler customization (PHC), and position-based speculative decoding (PSD). Together they ameliorate these problems on PowerPC processors. We show how each technique contributes to improving the overall performance of the interpreter for major Java benchmark programs on an IBM POWER3 processor. Of the three, PHC is the most effective. We also show that bytecode fetches are the main source of memory accesses and that PHC successfully eliminates the majority of them while keeping the instruction cache miss ratios small.
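
The abstract names write-through top-of-stack caching but does not spell out its mechanics, so the following is only a minimal sketch of the general idea, not the authors' PowerPC implementation. It assumes a hypothetical three-opcode stack machine (`PUSH`, `ADD`, `MUL`) in which every instruction leaves a result on top of the stack, and it models "loads" as a simple counter. The cached variant keeps a copy of the top value in a local variable (standing in for a register) and still writes every result to the in-memory stack, so the top operand never has to be loaded back.

```python
# Hypothetical toy opcodes; not real Java bytecodes.
PUSH, ADD, MUL = range(3)


def run_baseline(code):
    """Interpret `code`, loading both operands of every binary op from memory."""
    stack, loads, pc = [], 0, 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])  # store the immediate to the stack
            pc += 2
        else:
            b = stack.pop()             # load 1: top of stack
            a = stack.pop()             # load 2: second element
            loads += 2
            stack.append(a + b if op == ADD else a * b)
            pc += 1
    return stack[-1], loads


def run_tos_cached(code):
    """Same machine, but `tos` mirrors the top stack slot (write-through)."""
    stack, loads, pc = [], 0, 0
    tos = None  # stands in for a register holding a copy of stack[-1]
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            tos = code[pc + 1]
            stack.append(tos)           # write-through: memory stays up to date
            pc += 2
        else:
            b = tos                     # top operand comes from the "register"
            stack.pop()                 # only moves the stack pointer, no load counted
            a = stack.pop()             # the single real load: second element
            loads += 1
            tos = a + b if op == ADD else a * b
            stack.append(tos)           # write-through again
            pc += 1
    return tos, loads


# (2 + 3) * 4 on the toy machine
PROGRAM = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL]
```

On `PROGRAM`, both functions compute 20; the baseline counts four operand loads and the cached version counts two. Because memory always holds the true stack contents, the handlers in this sketch never need to track whether the cached copy is "dirty", which is the appeal of the write-through choice in this toy model.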