改进故障聚类测试覆盖率的词法表示

2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW) Pub Date : 2021-11-01 DOI:10.1109/ASEW52652.2021.00052

Juyeon Yoon, S. Yoo

{"title":"改进故障聚类测试覆盖率的词法表示","authors":"Juyeon Yoon, S. Yoo","doi":"10.1109/ASEW52652.2021.00052","DOIUrl":null,"url":null,"abstract":"Failure clustering aims to group multiple test failures based on shared root causes, helping developers to comprehend and debug each root cause (i.e., the underlying fault) in isolation. Clustering of failing test executions requires distances between those executions, for which distance measures between coverage vectors are widely used. Lexical representation of coverage has been suggested as an alternative, representing each structural element covered by a failing execution with the lexical tokens in the element. This paper investigates whether the granularity of the lexical representation affects the effectiveness of the failure clustering. We evaluate varying levels of tokenisation granularity by using them for clustering coexisting real-world test failures in Defects4J benchmark. Our results show that the traditionally adopted subtokenisation can actually deconstruct larger meaningful semantic token units, resulting in suboptimal clustering.","PeriodicalId":349977,"journal":{"name":"2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Enhancing Lexical Representation of Test Coverage for Failure Clustering\",\"authors\":\"Juyeon Yoon, S. Yoo\",\"doi\":\"10.1109/ASEW52652.2021.00052\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Failure clustering aims to group multiple test failures based on shared root causes, helping developers to comprehend and debug each root cause (i.e., the underlying fault) in isolation. Clustering of failing test executions requires distances between those executions, for which distance measures between coverage vectors are widely used. Lexical representation of coverage has been suggested as an alternative, representing each structural element covered by a failing execution with the lexical tokens in the element. This paper investigates whether the granularity of the lexical representation affects the effectiveness of the failure clustering. We evaluate varying levels of tokenisation granularity by using them for clustering coexisting real-world test failures in Defects4J benchmark. Our results show that the traditionally adopted subtokenisation can actually deconstruct larger meaningful semantic token units, resulting in suboptimal clustering.\",\"PeriodicalId\":349977,\"journal\":{\"name\":\"2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW)\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASEW52652.2021.00052\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASEW52652.2021.00052","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

故障集群的目的是基于共享的根本原因对多个测试故障进行分组，帮助开发人员孤立地理解和调试每个根本原因(即潜在的故障)。失败的测试执行的聚类需要这些执行之间的距离，而覆盖向量之间的距离度量被广泛使用。覆盖率的词法表示被建议作为一种替代方法，用元素中的词法记号表示执行失败所覆盖的每个结构元素。本文研究了词法表示的粒度是否会影响故障聚类的有效性。我们通过使用不同级别的标记化粒度来对缺陷4j基准中共存的实际测试失败进行聚类，从而评估不同级别的标记化粒度。我们的结果表明，传统采用的子标记化实际上可以解构更大的有意义的语义标记单元，导致次优聚类。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Enhancing Lexical Representation of Test Coverage for Failure Clustering

Failure clustering aims to group multiple test failures based on shared root causes, helping developers to comprehend and debug each root cause (i.e., the underlying fault) in isolation. Clustering of failing test executions requires distances between those executions, for which distance measures between coverage vectors are widely used. Lexical representation of coverage has been suggested as an alternative, representing each structural element covered by a failing execution with the lexical tokens in the element. This paper investigates whether the granularity of the lexical representation affects the effectiveness of the failure clustering. We evaluate varying levels of tokenisation granularity by using them for clustering coexisting real-world test failures in Defects4J benchmark. Our results show that the traditionally adopted subtokenisation can actually deconstruct larger meaningful semantic token units, resulting in suboptimal clustering.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW)

自引率

0.00%

发文量