{"title":"GPThreats-3: Is Automatic Malware Generation a Threat?","authors":"Marcus Botacin","doi":"10.1109/SPW59333.2023.00027","DOIUrl":null,"url":null,"abstract":"Recent research advances introduced large textual models, of which GPT-3 is state-of-the-art. They enable many applications, such as generating text and code. Whereas the model's capabilities might be explored for good, they might also cause some negative impact: The model's code generation capabilities might be used by attackers to assist in malware creation, a phenomenon that must be understood. In this work, our goal is to answer the question: Can current large textual models (represented by GPT-3) already be used by attackers to generate malware? If so: How can attackers use these models? We explore multiple coding strategies, ranging from the entire mal ware description to separate descriptions of mal ware functions that can be used as building blocks. We also test the model's ability to rewrite malware code in multiple manners. Our experiments show that GPT-3 still has trouble generating entire malware samples from complete descriptions but that it can easily construct malware via building block descriptions. It also still has limitations to understand the described contexts, but once it is done it generates multiple versions of the same semantic (malware variants), whose detection rate significantly varies (from 4 to 55 Virustotal AV s).","PeriodicalId":308378,"journal":{"name":"2023 IEEE Security and Privacy Workshops (SPW)","volume":"179 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE Security and Privacy Workshops (SPW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPW59333.2023.00027","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
Recent research advances introduced large textual models, of which GPT-3 is the state of the art. They enable many applications, such as generating text and code. While the models' capabilities might be harnessed for good, they might also cause negative impact: their code-generation capabilities might be used by attackers to assist in malware creation, a phenomenon that must be understood. In this work, our goal is to answer the question: Can current large textual models (represented by GPT-3) already be used by attackers to generate malware? If so, how can attackers use these models? We explore multiple coding strategies, ranging from a description of the entire malware to separate descriptions of malware functions that can be used as building blocks. We also test the model's ability to rewrite malware code in multiple manners. Our experiments show that GPT-3 still has trouble generating entire malware samples from complete descriptions, but that it can easily construct malware from building-block descriptions. It also still has limitations in understanding the described contexts, but once that is overcome it generates multiple versions of the same semantics (malware variants), whose detection rates vary significantly (from 4 to 55 VirusTotal AVs).
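The abstract does not disclose the exact prompts or harness used in the study, but the building-block strategy it describes can be illustrated with a minimal sketch: each function is requested from the model separately via a short natural-language description, and several completions are sampled per description to obtain variants of the same semantics. The sketch below is a hypothetical reconstruction, assuming the legacy OpenAI completions API (openai-python < 1.0) and the "text-davinci-003" model, with a benign file-listing task standing in for the prompts used in the paper.

```python
# Hypothetical sketch of the building-block strategy described in the
# abstract: rather than asking for a complete program, each function is
# generated from its own short description, and multiple completions
# are sampled per description to obtain variants of the same semantics.
#
# Assumptions (not from the paper): the legacy OpenAI completions API
# (openai-python < 1.0), the "text-davinci-003" model, and a benign
# placeholder task instead of the study's actual prompts.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# A benign placeholder for one "building block" description.
BLOCK_DESCRIPTION = "a C function that lists all files in a given directory"

def generate_variants(description: str, n_variants: int = 3) -> list[str]:
    """Sample several independent completions for the same description,
    yielding syntactically different implementations (variants)."""
    prompt = f"Write {description}. Return only the code."
    variants = []
    for _ in range(n_variants):
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            max_tokens=256,
            temperature=0.8,  # higher temperature encourages diverse rewrites
        )
        variants.append(response.choices[0].text.strip())
    return variants

if __name__ == "__main__":
    for i, code in enumerate(generate_variants(BLOCK_DESCRIPTION)):
        print(f"--- variant {i + 1} ---\n{code}\n")
```

In an evaluation like the one the abstract reports, variants sampled this way would then be compiled and submitted to a scanner such as VirusTotal to measure how widely detection rates vary across rewrites of the same semantics.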