基于半标记跟踪数据集的可证明网络流量测量与分析

Milan Cermák, Tomás Jirsík, P. Velan, Jana Komárková, Stanislav Špaček, Martin Drasar, Tomáš Plesník
{"title":"基于半标记跟踪数据集的可证明网络流量测量与分析","authors":"Milan Cermák, Tomás Jirsík, P. Velan, Jana Komárková, Stanislav Špaček, Martin Drasar, Tomáš Plesník","doi":"10.23919/TMA.2018.8506498","DOIUrl":null,"url":null,"abstract":"Research in network traffic measurement and analysis is a long-lasting field with growing interest from both scientists and the industry. However, even after so many years, results replication, criticism, and review are still rare. We face not only a lack of research standards, but also inaccessibility of appropriate datasets that can be used for methods development and evaluation. Therefore, a lot of potentially high-quality research cannot be verified and is not adopted by the industry or the community. The aim of this paper is to overcome this controversy with a unique solution based on a combination of distinct approaches proposed by other research works. Unlike these studies, we focus on the whole issue covering all areas of data anonymization, authenticity, recency, publicity, and their usage for research provability. We believe that these challenges can be solved by utilization of semi-labeled datasets composed of real-world network traffic and annotated units with interest-related packet traces only. In this paper, we outline the basic ideas of the methodology from unit trace collection and semi-labeled dataset creation to its usage for research evaluation. We strive for this proposal to start a discussion of the approach and help to overcome some of the challenges the research faces today.","PeriodicalId":6607,"journal":{"name":"2018 Network Traffic Measurement and Analysis Conference (TMA)","volume":"48 1","pages":"1-8"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":"{\"title\":\"Towards Provable Network Traffic Measurement and Analysis via Semi-Labeled Trace Datasets\",\"authors\":\"Milan Cermák, Tomás Jirsík, P. Velan, Jana Komárková, Stanislav Špaček, Martin Drasar, Tomáš Plesník\",\"doi\":\"10.23919/TMA.2018.8506498\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Research in network traffic measurement and analysis is a long-lasting field with growing interest from both scientists and the industry. However, even after so many years, results replication, criticism, and review are still rare. We face not only a lack of research standards, but also inaccessibility of appropriate datasets that can be used for methods development and evaluation. Therefore, a lot of potentially high-quality research cannot be verified and is not adopted by the industry or the community. The aim of this paper is to overcome this controversy with a unique solution based on a combination of distinct approaches proposed by other research works. Unlike these studies, we focus on the whole issue covering all areas of data anonymization, authenticity, recency, publicity, and their usage for research provability. We believe that these challenges can be solved by utilization of semi-labeled datasets composed of real-world network traffic and annotated units with interest-related packet traces only. In this paper, we outline the basic ideas of the methodology from unit trace collection and semi-labeled dataset creation to its usage for research evaluation. We strive for this proposal to start a discussion of the approach and help to overcome some of the challenges the research faces today.\",\"PeriodicalId\":6607,\"journal\":{\"name\":\"2018 Network Traffic Measurement and Analysis Conference (TMA)\",\"volume\":\"48 1\",\"pages\":\"1-8\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 Network Traffic Measurement and Analysis Conference (TMA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23919/TMA.2018.8506498\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 Network Traffic Measurement and Analysis Conference (TMA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/TMA.2018.8506498","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

摘要

网络流量测量和分析研究是一个长期存在的领域,越来越受到科学家和业界的关注。然而,即使经过了这么多年,结果的复制、批评和审查仍然很少。我们不仅面临缺乏研究标准的问题,而且还面临无法获得可用于方法开发和评估的适当数据集的问题。因此,许多潜在的高质量研究无法得到验证,也没有被行业或社区采用。本文的目的是克服这一争议与独特的解决方案基于不同的方法提出了其他研究工作的组合。与这些研究不同,我们关注的是整个问题,涵盖了数据匿名化、真实性、近代性、公共性及其用于研究可证明性的所有领域。我们相信这些挑战可以通过利用半标记数据集来解决,这些数据集由现实世界的网络流量和仅带有兴趣相关数据包跟踪的注释单元组成。在本文中,我们概述了该方法的基本思想,从单位跟踪收集和半标记数据集创建到其用于研究评估。我们希望这一提议能够引发对该方法的讨论,并帮助克服该研究今天面临的一些挑战。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Towards Provable Network Traffic Measurement and Analysis via Semi-Labeled Trace Datasets
Research in network traffic measurement and analysis is a long-lasting field with growing interest from both scientists and the industry. However, even after so many years, results replication, criticism, and review are still rare. We face not only a lack of research standards, but also inaccessibility of appropriate datasets that can be used for methods development and evaluation. Therefore, a lot of potentially high-quality research cannot be verified and is not adopted by the industry or the community. The aim of this paper is to overcome this controversy with a unique solution based on a combination of distinct approaches proposed by other research works. Unlike these studies, we focus on the whole issue covering all areas of data anonymization, authenticity, recency, publicity, and their usage for research provability. We believe that these challenges can be solved by utilization of semi-labeled datasets composed of real-world network traffic and annotated units with interest-related packet traces only. In this paper, we outline the basic ideas of the methodology from unit trace collection and semi-labeled dataset creation to its usage for research evaluation. We strive for this proposal to start a discussion of the approach and help to overcome some of the challenges the research faces today.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信