Control-Flow based Anomaly Detection in the Bug-Fixing Process of Open-Source Projects
Authors: Veena Saini, Paramvir Singh, A. Sureka
Published in: Proceedings of the 13th Innovations in Software Engineering Conference (formerly known as India Software Engineering Conference)
Publication date: 2020-02-27
DOI: 10.1145/3385032.3385038 (https://doi.org/10.1145/3385032.3385038)
Citations: 3
Abstract
In the past few years, substantial research has been conducted on detecting anomalies in real-world business processes. Existing research uses either process mining techniques or discrete sequence-based anomaly detection techniques. The bug-fixing process of various open-source projects has previously been analyzed with process mining techniques to discover process inefficiencies. These works rely on generic process mining tools to create the process models, and they do not evaluate the performance of their proposed conformance checking algorithms. In addition, discrete sequence-based analogy and anomaly detection techniques have not been discussed in the context of the bug-fixing process. In this paper, we report a bug-fixing process dataset for 30 Apache open-source projects that use the JIRA bug tracking system for bug reporting. This real-world dataset is analyzed to discover anomalous process sequences and the root causes of the anomalies. The contributions of this paper include (i) a formalized approach for pre-processing and transforming bug report history data from bug tracking systems into event logs suitable for process analysis; (ii) a process mining based anomaly detection framework for bug-fixing processes that comprises our proposed algorithms for process discovery and conformance checking; and (iii) an artificial labelled process dataset available at the Mendeley open-source dataset repository (doi:10.17632/5yb2xv93w3.1).
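To make the idea of control-flow anomaly detection concrete, the following is a minimal Python sketch. It is not the authors' discovery or conformance checking algorithm, which the abstract does not specify. It assumes a hypothetical list of status-change records (issue id, timestamp, status), groups them into one trace per bug, counts directly-follows transitions, and flags traces that contain a rarely seen transition. The record layout, the status names, the function names, and the `min_support` threshold are all illustrative assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical status-change records: (issue id, timestamp, new status).
# Real bug tracker histories carry more fields; this layout is an assumption.
history = [
    ("BUG-1", 1, "Open"), ("BUG-1", 2, "In Progress"),
    ("BUG-1", 3, "Resolved"), ("BUG-1", 4, "Closed"),
    ("BUG-2", 1, "Open"), ("BUG-2", 2, "In Progress"),
    ("BUG-2", 3, "Resolved"), ("BUG-2", 4, "Closed"),
    ("BUG-3", 1, "Open"), ("BUG-3", 2, "In Progress"),
    ("BUG-3", 3, "Resolved"), ("BUG-3", 4, "Closed"),
    ("BUG-4", 1, "Open"), ("BUG-4", 2, "Closed"), ("BUG-4", 3, "Reopened"),
]


def to_traces(records):
    """Group records by issue and order them by timestamp: one trace per bug."""
    traces = defaultdict(list)
    for issue, _timestamp, status in sorted(records, key=lambda r: (r[0], r[1])):
        traces[issue].append(status)
    return dict(traces)


def directly_follows(traces):
    """Count, for each (a, b) transition, how many traces contain it."""
    counts = Counter()
    for trace in traces.values():
        counts.update(set(zip(trace, trace[1:])))
    return counts


def anomalous(traces, min_support=0.5):
    """Flag traces with a transition present in fewer than min_support of all traces."""
    counts = directly_follows(traces)
    threshold = min_support * len(traces)
    return sorted(
        issue
        for issue, trace in traces.items()
        if any(counts[pair] < threshold for pair in zip(trace, trace[1:]))
    )


traces = to_traces(history)
flagged = anomalous(traces)
print(flagged)
```

In this toy log, three of the four bugs follow the same path, so the transitions in BUG-4 (Open to Closed, Closed to Reopened) fall below the support threshold and that trace is flagged. A frequency threshold is only one simple stand-in for a conformance check against a discovered process model.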