来自真实世界软件产品线的基于搜索的多样化采样

Yi Xiang, Han Huang, Yuren Zhou, Sizhe Li, Chuan Luo, Qingwei Lin, Miqing Li, Xiaowei Yang
{"title":"来自真实世界软件产品线的基于搜索的多样化采样","authors":"Yi Xiang, Han Huang, Yuren Zhou, Sizhe Li, Chuan Luo, Qingwei Lin, Miqing Li, Xiaowei Yang","doi":"10.1145/3510003.3510053","DOIUrl":null,"url":null,"abstract":"Real-world software product lines (SPLs) often encompass enormous valid configurations that are impossible to enumerate. To understand properties of the space formed by all valid configurations, a feasible way is to select a small and valid sample set. Even though a number of sampling strategies have been proposed, they either fail to produce diverse samples with respect to the number of selected features (an important property to characterize behaviors of configurations), or achieve diverse sampling but with limited scalability (the handleable configuration space size is limited to 1013). To resolve this dilemma, we propose a scalable diverse sampling strategy, which uses a distance metric in combination with the novelty search algorithm to produce diverse samples in an incremental way. The distance metric is carefully designed to measure similarities between configurations, and further diversity of a sample set. The novelty search incrementally improves diversity of samples through the search for novel configurations. We evaluate our sampling algorithm on 39 real-world SPLs. It is able to generate the required number of samples for all the SPLs, including those which cannot be counted by sharpSAT, a state-of-the-art model counting solver. Moreover, it performs better than or at least competitively to state-of-the-art samplers regarding diversity of the sample set. Experimental results suggest that only the proposed sampler (among all the tested ones) achieves scalable diverse sampling.","PeriodicalId":202896,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Search-based Diverse Sampling from Real-world Software Product Lines\",\"authors\":\"Yi Xiang, Han Huang, Yuren Zhou, Sizhe Li, Chuan Luo, Qingwei Lin, Miqing Li, Xiaowei Yang\",\"doi\":\"10.1145/3510003.3510053\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Real-world software product lines (SPLs) often encompass enormous valid configurations that are impossible to enumerate. To understand properties of the space formed by all valid configurations, a feasible way is to select a small and valid sample set. Even though a number of sampling strategies have been proposed, they either fail to produce diverse samples with respect to the number of selected features (an important property to characterize behaviors of configurations), or achieve diverse sampling but with limited scalability (the handleable configuration space size is limited to 1013). To resolve this dilemma, we propose a scalable diverse sampling strategy, which uses a distance metric in combination with the novelty search algorithm to produce diverse samples in an incremental way. The distance metric is carefully designed to measure similarities between configurations, and further diversity of a sample set. The novelty search incrementally improves diversity of samples through the search for novel configurations. We evaluate our sampling algorithm on 39 real-world SPLs. It is able to generate the required number of samples for all the SPLs, including those which cannot be counted by sharpSAT, a state-of-the-art model counting solver. Moreover, it performs better than or at least competitively to state-of-the-art samplers regarding diversity of the sample set. Experimental results suggest that only the proposed sampler (among all the tested ones) achieves scalable diverse sampling.\",\"PeriodicalId\":202896,\"journal\":{\"name\":\"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3510003.3510053\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3510003.3510053","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

摘要

现实世界的软件产品线(SPLs)通常包含大量无法枚举的有效配置。为了了解所有有效构型构成的空间的性质,一种可行的方法是选择一个小而有效的样本集。尽管已经提出了许多采样策略,但它们要么无法根据所选特征的数量(表征配置行为的重要属性)产生不同的样本,要么实现了不同的采样,但具有有限的可扩展性(可处理的配置空间大小限制为1013)。为了解决这一难题,我们提出了一种可扩展的多样化采样策略,该策略将距离度量与新颖性搜索算法相结合,以增量的方式产生多样化的样本。距离度量是精心设计的,用于测量配置之间的相似性,以及样本集的进一步多样性。新颖性搜索通过搜索新配置逐步提高样本的多样性。我们在39个真实的SPLs上评估了我们的抽样算法。它能够生成所有SPLs所需的样本数量,包括那些不能被sharpSAT计算的样本,一个最先进的模型计数求解器。此外,在样本集的多样性方面,它比最先进的采样器表现得更好,或者至少与之竞争。实验结果表明,在所有被测试的采样器中,只有提出的采样器实现了可扩展的多样化采样。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Search-based Diverse Sampling from Real-world Software Product Lines
Real-world software product lines (SPLs) often encompass enormous valid configurations that are impossible to enumerate. To understand properties of the space formed by all valid configurations, a feasible way is to select a small and valid sample set. Even though a number of sampling strategies have been proposed, they either fail to produce diverse samples with respect to the number of selected features (an important property to characterize behaviors of configurations), or achieve diverse sampling but with limited scalability (the handleable configuration space size is limited to 1013). To resolve this dilemma, we propose a scalable diverse sampling strategy, which uses a distance metric in combination with the novelty search algorithm to produce diverse samples in an incremental way. The distance metric is carefully designed to measure similarities between configurations, and further diversity of a sample set. The novelty search incrementally improves diversity of samples through the search for novel configurations. We evaluate our sampling algorithm on 39 real-world SPLs. It is able to generate the required number of samples for all the SPLs, including those which cannot be counted by sharpSAT, a state-of-the-art model counting solver. Moreover, it performs better than or at least competitively to state-of-the-art samplers regarding diversity of the sample set. Experimental results suggest that only the proposed sampler (among all the tested ones) achieves scalable diverse sampling.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信