Zhengzi Xu, Bihuan Chen, Mahinthan Chandramohan, Yang Liu, Fu Song
{"title":"SPAIN: Security Patch Analysis for Binaries towards Understanding the Pain and Pills","authors":"Zhengzi Xu, Bihuan Chen, Mahinthan Chandramohan, Yang Liu, Fu Song","doi":"10.1109/ICSE.2017.49","DOIUrl":null,"url":null,"abstract":"Software vulnerability is one of the major threats to software security. Once discovered, vulnerabilities are often fixed by applying security patches. In that sense, security patches carry valuable information about vulnerabilities, which could be used to discover, understand and fix (similar) vulnerabilities. However, most existing patch analysis approaches work at the source code level, while binary-level patch analysis often heavily relies on a lot of human efforts and expertise. Even worse, some vulnerabilities may be secretly patched without applying CVE numbers, or only the patched binary programs are available while the patches are not publicly released. These practices greatly hinder patch analysis and vulnerability analysis. In this paper, we propose a scalable binary-level patch analysis framework, named SPAIN, which can automatically identify security patches and summarize patch patterns and their corresponding vulnerability patterns. Specifically, given the original and patched versions of a binary program, we locate the patched functions and identify the changed traces (i.e., a sequence of basic blocks) that may contain security or non-security patches. Then we identify security patches through a semantic analysis of these traces and summarize the patterns through a taint analysis on the patched functions. The summarized patterns can be used to search similar patches or vulnerabilities in binary programs. Our experimental results on several real-world projects have shown that: i) SPAIN identified security patches with high accuracy and high scalability, ii) SPAIN summarized 5 patch patterns and their corresponding vulnerability patterns for 5 vulnerability types, and iii) SPAIN discovered security patches that were not documented, and discovered 3 zero-day vulnerabilities.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"179 1","pages":"462-472"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"117","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2017.49","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 117
Abstract
Software vulnerability is one of the major threats to software security. Once discovered, vulnerabilities are often fixed by applying security patches. In that sense, security patches carry valuable information about vulnerabilities, which could be used to discover, understand and fix (similar) vulnerabilities. However, most existing patch analysis approaches work at the source code level, while binary-level patch analysis often heavily relies on a lot of human efforts and expertise. Even worse, some vulnerabilities may be secretly patched without applying CVE numbers, or only the patched binary programs are available while the patches are not publicly released. These practices greatly hinder patch analysis and vulnerability analysis. In this paper, we propose a scalable binary-level patch analysis framework, named SPAIN, which can automatically identify security patches and summarize patch patterns and their corresponding vulnerability patterns. Specifically, given the original and patched versions of a binary program, we locate the patched functions and identify the changed traces (i.e., a sequence of basic blocks) that may contain security or non-security patches. Then we identify security patches through a semantic analysis of these traces and summarize the patterns through a taint analysis on the patched functions. The summarized patterns can be used to search similar patches or vulnerabilities in binary programs. Our experimental results on several real-world projects have shown that: i) SPAIN identified security patches with high accuracy and high scalability, ii) SPAIN summarized 5 patch patterns and their corresponding vulnerability patterns for 5 vulnerability types, and iii) SPAIN discovered security patches that were not documented, and discovered 3 zero-day vulnerabilities.