[Tool] Designing Replicable Networking Experiments With TriScale

J. Syst. Res. | Pub Date: 2021-09-22 | DOI: 10.5070/sr31155408
Romain Jacob, Marco Zimmerling, C. Boano, L. Vanbever, L. Thiele
Citations: 4

Abstract

When designing their performance evaluations, networking researchers often encounter questions such as: How long should a run be? How many runs to perform? How to account for the variability across multiple runs? What statistical methods should be used to analyze the data? Despite their best intentions, researchers often answer these questions differently, thus impairing the replicability of their evaluations and the confidence in their results. In this paper, we propose a concrete methodology for the design and analysis of performance evaluations. Our approach hierarchically partitions the performance evaluation into three timescales, following the principle of separation of concerns. The idea is to understand, for each timescale, the temporal characteristics of variability sources, and then to apply rigorous statistical methods to derive performance results with quantifiable confidence in spite of the inherent variability. We implement this methodology in a software framework called TriScale. For each performance metric, TriScale computes a variability score that estimates, with a given confidence, how similar the results would be if the evaluation were replicated; in other words, TriScale quantifies the replicability of evaluations. We showcase the practicality and usefulness of TriScale on four different case studies, demonstrating that TriScale helps to generalize and strengthen published results. Improving the standards of replicability in networking is a complex challenge. This paper is an important contribution to this endeavor; it provides networking researchers with a rational and concrete experimental methodology rooted in sound statistical foundations, the first of its kind.
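
Two of the questions above, how many runs to perform and how replicable a result is, can be approached with distribution-free (order-statistics) arguments of the kind the abstract alludes to. The sketch below is our own illustration, not TriScale's actual code: the function names, the defaults, and the specific choice of the width of a median confidence interval as a stand-in for the variability score are assumptions.

```python
# Illustrative sketch only; TriScale's real API and statistical choices may differ.
import math
import numpy as np
from scipy.stats import binom


def min_runs(percentile: float, confidence: float) -> int:
    """Smallest number of runs N such that the sample minimum is a lower
    bound on the given percentile with the given confidence:
    P(min <= q_p) = 1 - (1 - p)^N >= C  =>  N >= log(1 - C) / log(1 - p)."""
    p, c = percentile, confidence
    return math.ceil(math.log(1 - c) / math.log(1 - p))


def median_ci(samples, confidence: float = 0.95):
    """Distribution-free confidence interval for the median, built from
    symmetric order statistics x_(k) and x_(n-k+1)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    # Largest k (narrowest interval) whose coverage still meets `confidence`.
    for k in range(n // 2, 0, -1):
        coverage = binom.cdf(n - k, n, 0.5) - binom.cdf(k - 1, n, 0.5)
        if coverage >= confidence:
            return x[k - 1], x[n - k], coverage
    # Too few samples for the requested confidence: fall back to the full range.
    return x[0], x[-1], 1 - 2 * 0.5 ** n


def variability_score(kpis, confidence: float = 0.95) -> float:
    """Width of the median CI across per-series KPIs: the narrower it is,
    the more similar a replicated evaluation is expected to be."""
    lo, hi, _ = median_ci(kpis, confidence)
    return hi - lo


if __name__ == "__main__":
    # How many runs to lower-bound the 10th percentile with 95% confidence?
    print(min_runs(0.10, 0.95))  # -> 29

    # Variability score over synthetic per-series KPI values.
    rng = np.random.default_rng(1)
    kpis = rng.normal(loc=100.0, scale=5.0, size=25)
    print(round(variability_score(kpis, 0.95), 2))
```

Under these definitions, lower-bounding the 10th percentile with 95% confidence requires 29 runs, while bounding the median requires only 5; the narrower the resulting interval across series, the more confident one can be that a replication would report a similar KPI.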