{"title":"MirrorTaint:用于基于jvm的微服务系统的实用非侵入式动态污点跟踪","authors":"Yicheng Ouyang, Kailai Shao, Kunqiu Chen, Ruobing Shen, Chao Chen, Mingze Xu, Yuqun Zhang, Lingming Zhang","doi":"10.1109/ICSE48619.2023.00210","DOIUrl":null,"url":null,"abstract":"Taint analysis, i.e., labeling data and propagating the labels through data flows, has been widely used for analyzing program information flows and ensuring system/data security. Due to its important applications, various taint analysis techniques have been proposed, including static and dynamic taint analysis. However, existing taint analysis techniques can be hardly applied to the rising microservice systems for industrial applications. To address such a problem, in this paper, we proposed the first practical non-intrusive dynamic taint analysis technique MirrorTaint for extensively supporting microservice systems on JVMs. In particular, by instrumenting the microservice systems, MirrorTaint constructs a set of data structures with their respective policies for labeling/propagating taints in its mirrored space. Such data structures are essentially non-intrusive, i.e., modifying no program meta-data or runtime system. Then, during program execution, MirrorTaint replicates the stack-based JVM instruction execution in its mirrored space on-the-fly for dynamic taint tracking. We have evaluated MirrorTaint against state-of-the-art dynamic and static taint analysis systems on various popular open-source microservice systems. The results demonstrate that MirrorTaint can achieve better compatibility, quite close precision and higher recall (97.9%/100.0%) than state-of-the-art Phosphor (100.0%/9.9%) and FlowDroid (100%/28.2%). Also, MirrorTaint incurs lower runtime overhead than Phosphor (although both are dynamic techniques). Moreover, we have performed a case study in Ant Group, a global billion-user FinTech company, to compare MirrorTaint and their mature developer-experience-based data checking system for automatically generated fund documents. The result shows that the developer experience can be incomplete, causing the data checking system to only cover 84.0% total data relations, while MirrorTaint can automatically find 99.0% relations with 100.0% precision. Lastly, we also applied MirrorTaint to successfully detect a recently wide-spread Log4j2 security vulnerability.","PeriodicalId":376379,"journal":{"name":"2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MirrorTaint: Practical Non-intrusive Dynamic Taint Tracking for JVM-based Microservice Systems\",\"authors\":\"Yicheng Ouyang, Kailai Shao, Kunqiu Chen, Ruobing Shen, Chao Chen, Mingze Xu, Yuqun Zhang, Lingming Zhang\",\"doi\":\"10.1109/ICSE48619.2023.00210\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Taint analysis, i.e., labeling data and propagating the labels through data flows, has been widely used for analyzing program information flows and ensuring system/data security. Due to its important applications, various taint analysis techniques have been proposed, including static and dynamic taint analysis. However, existing taint analysis techniques can be hardly applied to the rising microservice systems for industrial applications. To address such a problem, in this paper, we proposed the first practical non-intrusive dynamic taint analysis technique MirrorTaint for extensively supporting microservice systems on JVMs. In particular, by instrumenting the microservice systems, MirrorTaint constructs a set of data structures with their respective policies for labeling/propagating taints in its mirrored space. Such data structures are essentially non-intrusive, i.e., modifying no program meta-data or runtime system. Then, during program execution, MirrorTaint replicates the stack-based JVM instruction execution in its mirrored space on-the-fly for dynamic taint tracking. We have evaluated MirrorTaint against state-of-the-art dynamic and static taint analysis systems on various popular open-source microservice systems. The results demonstrate that MirrorTaint can achieve better compatibility, quite close precision and higher recall (97.9%/100.0%) than state-of-the-art Phosphor (100.0%/9.9%) and FlowDroid (100%/28.2%). Also, MirrorTaint incurs lower runtime overhead than Phosphor (although both are dynamic techniques). Moreover, we have performed a case study in Ant Group, a global billion-user FinTech company, to compare MirrorTaint and their mature developer-experience-based data checking system for automatically generated fund documents. The result shows that the developer experience can be incomplete, causing the data checking system to only cover 84.0% total data relations, while MirrorTaint can automatically find 99.0% relations with 100.0% precision. Lastly, we also applied MirrorTaint to successfully detect a recently wide-spread Log4j2 security vulnerability.\",\"PeriodicalId\":376379,\"journal\":{\"name\":\"2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE48619.2023.00210\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE48619.2023.00210","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
MirrorTaint: Practical Non-intrusive Dynamic Taint Tracking for JVM-based Microservice Systems
Taint analysis, i.e., labeling data and propagating the labels through data flows, has been widely used for analyzing program information flows and ensuring system/data security. Due to its important applications, various taint analysis techniques have been proposed, including static and dynamic taint analysis. However, existing taint analysis techniques can be hardly applied to the rising microservice systems for industrial applications. To address such a problem, in this paper, we proposed the first practical non-intrusive dynamic taint analysis technique MirrorTaint for extensively supporting microservice systems on JVMs. In particular, by instrumenting the microservice systems, MirrorTaint constructs a set of data structures with their respective policies for labeling/propagating taints in its mirrored space. Such data structures are essentially non-intrusive, i.e., modifying no program meta-data or runtime system. Then, during program execution, MirrorTaint replicates the stack-based JVM instruction execution in its mirrored space on-the-fly for dynamic taint tracking. We have evaluated MirrorTaint against state-of-the-art dynamic and static taint analysis systems on various popular open-source microservice systems. The results demonstrate that MirrorTaint can achieve better compatibility, quite close precision and higher recall (97.9%/100.0%) than state-of-the-art Phosphor (100.0%/9.9%) and FlowDroid (100%/28.2%). Also, MirrorTaint incurs lower runtime overhead than Phosphor (although both are dynamic techniques). Moreover, we have performed a case study in Ant Group, a global billion-user FinTech company, to compare MirrorTaint and their mature developer-experience-based data checking system for automatically generated fund documents. The result shows that the developer experience can be incomplete, causing the data checking system to only cover 84.0% total data relations, while MirrorTaint can automatically find 99.0% relations with 100.0% precision. Lastly, we also applied MirrorTaint to successfully detect a recently wide-spread Log4j2 security vulnerability.