How Bayesians Debug

Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.83

Chao Liu, Zeng Lian, Jiawei Han


Citations: 29

Abstract


How Bayesians Debug

Manual debugging is expensive, and its high cost has motivated extensive research on automated fault localization in both the software engineering and data mining communities. Fault localization aims to automatically identify likely fault locations and thereby assist manual debugging. A number of fault localization algorithms have been developed in recent years, and they prove effective when multiple failing and passing cases are available. However, what is more commonly encountered in practice is the two-sample debugging problem, where only one failing case and one passing case are available. This problem has been either overlooked or insufficiently tackled in previous studies. In this paper, we develop a new fault localization algorithm, named BayesDebug, which simulates some manual debugging principles through a Bayesian approach. Unlike existing approaches that base fault analysis on multiple passing and failing cases, BayesDebug requires only one passing case and one failing case. We reason about why BayesDebug fits the two-sample debugging problem and why other approaches do not. Finally, an experiment with a real-world program, grep-2.2, exemplifies the effectiveness of BayesDebug.
