Detecting Unknown DGAs Using Distances Between Feature Vectors of Domain Names
Ji Huan, Yongzheng Zhang, Peng Chang, Yupeng Tuo
2021 28th International Conference on Telecommunications (ICT), published 2021-06-01
DOI: 10.1109/ICT52184.2021.9511517 (https://doi.org/10.1109/ICT52184.2021.9511517)
Many botnets use domain generation algorithms (DGAs) to set up stealthy Command & Control (C2) communication. A DGA generates a large number of domain names, and the attacker maps some of them to the C2 servers. In this paper, we propose Talos, a DGA detection approach that accurately detects both unknown and known DGAs. The key insight of Talos is that domain names can be represented by feature vectors whose pairwise distances reflect whether the names belong to the same class. Talos uses a neural language model to extract the feature vector of a domain name. It then decides whether the feature vector belongs to a class by checking whether the vector lies within the boundary of that class and near its centroid. We evaluate the detection ability of Talos on both unknown and known DGAs. Our experimental results show that Talos achieves recall above 92% on unknown classes and an F1-score above 95% on known classes. We also compare Talos with state-of-the-art detection approaches and find that its ability to detect unknown DGAs substantially surpasses theirs.
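The centroid-and-boundary decision described in the abstract can be illustrated with a minimal sketch. This is not the Talos implementation: the abstract does not specify the distance metric, how the boundary is defined, or the feature extractor. The sketch assumes Euclidean distance, takes the boundary of a class to be the largest distance from its centroid to any training vector, and uses hand-made 2-D vectors in place of features from a neural language model. The names `fit_classes` and `classify` are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]
ClassModel = Dict[str, Tuple[List[float], float]]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def fit_classes(train: Dict[str, List[Vector]]) -> ClassModel:
    """Compute a (centroid, radius) pair for each known class.

    Assumption: the class boundary is a sphere whose radius is the largest
    Euclidean distance from the centroid to a training vector of that class.
    """
    model: ClassModel = {}
    for label, vectors in train.items():
        c = centroid(vectors)
        radius = max(math.dist(c, v) for v in vectors)
        model[label] = (c, radius)
    return model


def classify(model: ClassModel, x: Vector) -> Optional[str]:
    """Return the nearest class whose boundary contains x.

    Returns None when x lies outside every known class, which this sketch
    treats as "unknown".
    """
    best_label: Optional[str] = None
    best_dist = math.inf
    for label, (c, radius) in model.items():
        d = math.dist(c, x)
        if d <= radius and d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy feature vectors (assumed); real vectors would come from a trained model.
train = {
    "benign": [[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2]],
    "dga_a": [[5.0, 5.0], [5.2, 4.9], [4.8, 5.1]],
}
model = fit_classes(train)

for query in ([0.1, 0.1], [5.1, 5.0], [10.0, -3.0]):
    print(query, "->", classify(model, query))
```

In this toy setup, a vector far from both centroids is assigned to no class. That rejection is what lets a distance-based scheme flag a domain as belonging to a class it has not seen, instead of forcing it into the closest known one.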