Large-scale classification of IPv6-IPv4 siblings with variable clock skew

Quirin Scheitle, Oliver Gasser, Minoo Rouhi, G. Carle
Published in: 2017 Network Traffic Measurement and Analysis Conference (TMA)
DOI: 10.23919/TMA.2017.8002901 (https://doi.org/10.23919/TMA.2017.8002901)
Publication date: 2016-10-24
Citations: 22

Abstract

Linking the growing IPv6 deployment to existing IPv4 addresses is an interesting field of research, be it for network forensics, structural analysis, or reconnaissance. In this work, we focus on classifying pairs of server IPv6 and IPv4 addresses as siblings, i.e., running on the same machine. Our methodology leverages active measurements of TCP timestamps and other network characteristics, which we measure against a diverse ground truth of 682 hosts. We define and extract a set of features, including an estimate of variable (as opposed to constant) remote clock skew. On these features, we train a manually crafted algorithm as well as a machine-learned decision tree. By conducting several measurement runs and training in cross-validation rounds, we aim to create models that generalize well and do not overfit our training data. We find that both models exceed 99% precision in both training and testing. We validate scalability by classifying 149k siblings in a large-scale measurement of 371k sibling candidates. We argue that this methodology, thoroughly cross-validated and likely to generalize well, can aid comparative studies of IPv6 and IPv4 behavior on the Internet. Striving for applicability and replicability, we release ready-to-use source code and raw data from our study.
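To make the clock-skew idea concrete, the following is a minimal sketch and not the authors' released code. It assumes the simpler constant-skew case: the remote TCP timestamp clock is taken to tick at a known nominal rate `hz`, and the skew is the least-squares slope of the remote-minus-local offset over time. The function names, the `hz` parameter, and the `tol_ppm` threshold are hypothetical and chosen only for illustration. The abstract does not specify the paper's variable-skew estimation or its full feature set, so neither is reproduced here, and timestamp wrap-around is ignored.

```python
from statistics import mean


def estimate_skew_ppm(recv_times, tcp_ts_vals, hz):
    """Estimate a constant remote clock skew in parts per million (ppm).

    recv_times  -- local receive times in seconds
    tcp_ts_vals -- TCP timestamp values observed in the same packets
    hz          -- assumed nominal tick rate of the remote timestamp clock
    """
    if len(recv_times) < 2 or len(recv_times) != len(tcp_ts_vals):
        raise ValueError("need at least two paired samples")
    t0, ts0 = recv_times[0], tcp_ts_vals[0]
    # Local elapsed time since the first sample.
    xs = [t - t0 for t in recv_times]
    # Offset: remote elapsed time minus local elapsed time, in seconds.
    ys = [(ts - ts0) / hz - x for ts, x in zip(tcp_ts_vals, xs)]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("receive times must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # The slope of offset over time is the skew; scale to ppm.
    return sxy / sxx * 1e6


def looks_like_siblings(skew_v4_ppm, skew_v6_ppm, tol_ppm=1.0):
    """Illustrative check: similar skew on both addresses hints at one machine."""
    return abs(skew_v4_ppm - skew_v6_ppm) <= tol_ppm


if __name__ == "__main__":
    # Synthetic example: a remote 1000 Hz clock running 50 ppm fast,
    # sampled once per minute for an hour.
    recv = [60.0 * i for i in range(61)]
    ts = [round(1000 * t * (1 + 50e-6)) for t in recv]
    print(estimate_skew_ppm(recv, ts, 1000))
```

A single threshold on skew difference is only one possible signal. The abstract states that the classifiers are trained on a set of features, so a check like `looks_like_siblings` would be one input among several rather than a decision rule on its own.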