TagClass: A Tool for Extracting Class-Determined Tags from Massive Malware Labels via Incremental Parsing
Authors: Y. Jiang, Gaolei Li, Shenghong Li
Venue: 2023 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
Published: 2023-06-01
DOI: https://doi.org/10.1109/DSN58367.2023.00029
VirusTotal is widely used for malware annotation because it provides malware labels from a large set of anti-malware engines. A long-standing challenge in using these inconsistent labels is extracting class-determined tags. In this paper, we present TagClass, a tool based on incremental parsing that associates tags with their corresponding family, behavior, and platform classes. TagClass treats behavior and platform tags as locators and achieves incremental parsing by introducing and iterating two algorithms: 1) location first search, which hits family tags using locators, and 2) co-occurrence first search, which finds new locators using family tags. Experiments across two benchmark datasets indicate that TagClass outperforms existing methods, improving parsing accuracy by 21% and 28%, respectively. To the best of our knowledge, TagClass is the first tag class-determined malware label parsing tool, and it could pave the way for research on crowdsourced malware annotation. TagClass has been released to the community at https://github.com/crowdma/tagclass.
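
The alternation between the two searches can be illustrated with a minimal Python sketch. It is not the released TagClass implementation: the function names, the adjacency rule (a tag directly following a locator is taken as a family tag), the `min_count` threshold, and the sample labels are all assumptions made for illustration, and the sketch does not distinguish behavior locators from platform locators.

```python
import re
from collections import Counter


def tokenize(label):
    """Split an engine label such as 'Trojan.Win32.Zbot.a' into lowercase tags."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", label.lower()) if t]


def location_first_search(labels, locators, families):
    """Assumed rule: a non-locator tag directly after a locator is a family tag."""
    found = set()
    for label in labels:
        tags = tokenize(label)
        for prev, cur in zip(tags, tags[1:]):
            if prev in locators and cur not in locators:
                found.add(cur)
    return found - families  # only the newly discovered family tags


def cooccurrence_first_search(labels, locators, families, min_count=2):
    """Assumed rule: a tag that precedes known family tags at least
    min_count times is promoted to a locator."""
    counts = Counter()
    for label in labels:
        tags = tokenize(label)
        for prev, cur in zip(tags, tags[1:]):
            if cur in families and prev not in families and prev not in locators:
                counts[prev] += 1
    return {tag for tag, n in counts.items() if n >= min_count}


def incremental_parse(labels, seed_locators):
    """Alternate the two searches until neither one finds anything new."""
    locators, families = set(seed_locators), set()
    while True:
        new_families = location_first_search(labels, locators, families)
        families |= new_families
        new_locators = cooccurrence_first_search(labels, locators, families)
        locators |= new_locators
        if not new_families and not new_locators:
            return locators, families


# Hypothetical labels, not taken from the paper's datasets.
labels = [
    "Trojan.Win32.Zbot.a",
    "Trojan.Win32.Emotet",
    "Linux.Zbot.gen",
    "Linux.Emotet",
    "Linux.Mirai.b",
]
locators, families = incremental_parse(labels, seed_locators={"win32"})
```

Starting from the single seed locator `win32`, the first pass finds the family tags `zbot` and `emotet`; their co-occurrence with `linux` promotes it to a locator, which in turn exposes `mirai` on the next pass. Because both sets only grow and the tag vocabulary is finite, the loop terminates.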