Cainã Figueiredo, João Gabriel Lopes, R. Azevedo, Gerson Zaverucha, D. Menasché, Leandro Pfleger De Aguiar
Software Vulnerabilities, Products and Exploits: A Statistical Relational Learning Approach
Published in: 2021 IEEE International Conference on Cyber Security and Resilience (CSR), 2021-07-26
DOI: 10.1109/CSR51186.2021.9527984
Citation count: 1
Abstract
Data on software vulnerabilities, products and exploits is typically collected from multiple unstructured sources. Valuable information, e.g., which products are affected by which exploits, is obtained by matching data from those sources, i.e., through their relations. In this paper, we leverage this simple yet unexplored observation to introduce a statistical relational learning (SRL) approach for the analysis of vulnerabilities, products and exploits. In particular, we address the problem of determining whether an exploit exists for a given product, given the relations between products and vulnerabilities and between vulnerabilities and exploits. We focus on Industrial Control Systems (ICS), the National Vulnerability Database and ExploitDB. Using RDN-Boost, we reach an AUC ROC of 0.83 and an AUC PR of 0.69 on this problem. We show that, to reach that performance, it is instrumental to include textual features, e.g., extracted from the description of vulnerabilities, as well as structured information, e.g., about product categories. In addition, using interpretable relational regression trees, we report simple rules that shed light on factors impacting the weaponization of ICS products.
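
To make the relational view concrete, the following minimal sketch composes a product-vulnerability relation with a vulnerability-exploit relation to obtain, for each product, whether any exploit reaches it. It is not the RDN-Boost model from the paper; it only illustrates the matching step the abstract describes. The relation names (`affects`, `targets`), the function names, and every identifier in the toy data are assumptions invented for illustration.

```python
from collections import defaultdict

# Toy relations; every identifier below is hypothetical.
# affects(vulnerability, product): the kind of fact a vulnerability database could supply.
affects = [
    ("CVE-A", "plc_x"),
    ("CVE-B", "plc_x"),
    ("CVE-C", "hmi_y"),
    ("CVE-D", "scada_z"),
]
# targets(exploit, vulnerability): the kind of fact an exploit database could supply.
targets = [
    ("EDB-1", "CVE-B"),
    ("EDB-2", "CVE-D"),
    ("EDB-3", "CVE-D"),
]


def exploits_per_product(affects, targets):
    """Compose the two relations: product -> vulnerability -> exploit."""
    exploits_by_vuln = defaultdict(set)
    for exploit, vuln in targets:
        exploits_by_vuln[vuln].add(exploit)

    result = defaultdict(set)
    for vuln, product in affects:
        # Accessing result[product] creates the entry, so products
        # with no known exploit still appear with an empty set.
        result[product] |= exploits_by_vuln.get(vuln, set())
    return dict(result)


def has_exploit(affects, targets):
    """Binary label per product: is it reached by at least one exploit?"""
    per_product = exploits_per_product(affects, targets)
    return {product: bool(found) for product, found in per_product.items()}


labels = has_exploit(affects, targets)
for product, flag in sorted(labels.items()):
    print(product, flag)
```

In a learning setting, a label of this kind would be the quantity to predict for products whose exploit links are unknown, with the relations, textual features and product categories serving as evidence.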