Finding Critical Files from a Packet

Junnyung Hur, Hahoon Jeon, Hyeon Gy Shon, Young Jae Kim, Myungkeun Yoon
{"title":"Finding Critical Files from a Packet","authors":"Junnyung Hur, Hahoon Jeon, Hyeon Gy Shon, Young Jae Kim, Myungkeun Yoon","doi":"10.1109/INFOCOM42981.2021.9488914","DOIUrl":null,"url":null,"abstract":"Network-based intrusion detection and data leakage prevention systems inspect packets to detect if critical files such as malware or confidential documents are transferred. However, this kind of detection requires heavy computing resources in reassembling packets and only well-known protocols can be interpreted. Besides, finding similar files from a storage requires pairwise comparisons. In this paper, we present a new network-based file identification scheme that inspects packets independently without reassembly and finds similar files through inverted indexing instead of pairwise comparison. We use a contents-based chunking algorithm to consistently divide both files and packets into multiple byte sequences, called chunks. If a packet is a part of a file, they would have common chunks. The challenging problem is that packet chunking and inverted-index search should be fast and scalable enough for packet processing. The file identification should be accurate although many chunks are noises. In this paper, we use a small Bloom filter and a delayed query strategy to solve the problems. To the best of our knowledge, this is the first scheme that identifies a specific critical file from a packet over unknown protocols. Experimental results show that the proposed scheme can successfully identify a critical file from a packet.","PeriodicalId":293079,"journal":{"name":"IEEE INFOCOM 2021 - IEEE Conference on Computer Communications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE INFOCOM 2021 - IEEE Conference on Computer Communications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/INFOCOM42981.2021.9488914","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Network-based intrusion detection and data leakage prevention systems inspect packets to detect if critical files such as malware or confidential documents are transferred. However, this kind of detection requires heavy computing resources in reassembling packets and only well-known protocols can be interpreted. Besides, finding similar files from a storage requires pairwise comparisons. In this paper, we present a new network-based file identification scheme that inspects packets independently without reassembly and finds similar files through inverted indexing instead of pairwise comparison. We use a contents-based chunking algorithm to consistently divide both files and packets into multiple byte sequences, called chunks. If a packet is a part of a file, they would have common chunks. The challenging problem is that packet chunking and inverted-index search should be fast and scalable enough for packet processing. The file identification should be accurate although many chunks are noises. In this paper, we use a small Bloom filter and a delayed query strategy to solve the problems. To the best of our knowledge, this is the first scheme that identifies a specific critical file from a packet over unknown protocols. Experimental results show that the proposed scheme can successfully identify a critical file from a packet.
从数据包中查找关键文件
基于网络的入侵检测和数据泄漏防御系统通过检测报文是否传输了恶意软件或机密文件等重要文件。但是,这种检测方式在重组报文时需要大量的计算资源,并且只能解释已知的协议。此外,从存储中查找相似的文件需要两两比较。在本文中,我们提出了一种新的基于网络的文件识别方案,该方案独立检测数据包而不重组,并通过倒排索引而不是两两比较来查找相似的文件。我们使用基于内容的分块算法将文件和数据包一致地划分为多个字节序列,称为块。如果一个包是文件的一部分,那么它们将具有共同的块。具有挑战性的问题是,分组和倒排索引搜索对于分组处理来说应该足够快速和可扩展。文件识别应该是准确的,尽管许多块是噪声。在本文中,我们使用一个小的布隆过滤器和延迟查询策略来解决这个问题。据我们所知,这是第一个通过未知协议从数据包中识别特定关键文件的方案。实验结果表明,该方法能够成功地从数据包中识别出关键文件。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信