{"title":"Micro-architectural anatomy of a commercial TCP/IP stack","authors":"R. Illikkal, R. Iyer, D. Newell","doi":"10.1109/WWC.2004.1437394","DOIUrl":null,"url":null,"abstract":"Over the last couple of decades, computer architects and performance analysts have routinely attempted to profile the overhead of TCP/IP processing in an effort to understand where the time was spent. It is well understood that this is a rather difficult problem since the processing time is spread across various software modules such as the network stack, interrupt routines, drivers, O/S scheduler, etc. As a result, the problem of extracting the micro-architectural characteristics of TCP/IP processing is significantly more challenging. In this paper, we start by covering the previous attempts at this problem and show what existing tools can provide in terms of execution time characteristics. We then propose a detailed methodology that combines full-system simulation, cycle-accurate performance simulations and symbol annotation to provide a rich cycle-accurate view of TCP/IP packet processing execution. We discuss initial results based on our profiling methodology and discuss where the time is spent. This includes an analysis of micro-architectural characteristics (such as instruction breakdown, CPI, MPI and TLB misses on a state-of-the-art microprocessor).","PeriodicalId":240633,"journal":{"name":"IEEE International Workshop on Workload Characterization, 2004. WWC-7. 2004","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2004-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Workshop on Workload Characterization, 2004. WWC-7. 2004","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WWC.2004.1437394","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Over the last couple of decades, computer architects and performance analysts have routinely attempted to profile the overhead of TCP/IP processing in an effort to understand where the time was spent. It is well understood that this is a rather difficult problem since the processing time is spread across various software modules such as the network stack, interrupt routines, drivers, O/S scheduler, etc. As a result, the problem of extracting the micro-architectural characteristics of TCP/IP processing is significantly more challenging. In this paper, we start by covering the previous attempts at this problem and show what existing tools can provide in terms of execution time characteristics. We then propose a detailed methodology that combines full-system simulation, cycle-accurate performance simulations and symbol annotation to provide a rich cycle-accurate view of TCP/IP packet processing execution. We discuss initial results based on our profiling methodology and discuss where the time is spent. This includes an analysis of micro-architectural characteristics (such as instruction breakdown, CPI, MPI and TLB misses on a state-of-the-art microprocessor).