Jiawen Shen;Shikai Guo;Longfeng Chen;Chen Wu;Hui Li;Chenchen Li
Title: Extracting Meaningful Issue–Solution Pair From Collaborative Developer Live Chats
DOI: 10.1109/TR.2025.3550412
Journal: IEEE Transactions on Reliability, vol. 74, no. 3, pp. 3600-3614
Publication date: 2025-04-01
Publication type: Journal Article
Journal impact factor: 5.7 (JCR Q1, Computer Science, Hardware & Architecture)
URL: https://ieeexplore.ieee.org/document/10945766/
Abstract
Developer live chats often contain valuable information in the form of issue–solution pairs. These pairs can serve as references for others seeking solutions to similar issues, which improves software development efficiency by speeding up issue resolution. However, previous approaches such as ISPY still suffer from unsatisfactory extraction accuracy because the feature information of issue–solution pairs is entangled and complex. To address these challenges, we propose an approach named IS-Hunter for mining issue–solution pairs from real-time chat data. Specifically, IS-Hunter consists of four main components: the data preprocessing component disentangles and denoises raw chat logs; the utterance embedding component embeds utterances into vectors that subsequent components can easily process; the feature extraction component obtains textual, heuristic, and contextual features that determine whether an utterance is topic-relevant; and the issue–solution pair prediction component predicts whether an utterance is an issue or a solution. The experimental results show that IS-Hunter outperforms the baseline methods in issue detection and solution extraction in terms of precision, recall, and F1-score. In issue detection, IS-Hunter achieves an average precision, recall, and F1-score of 0.74, 0.74, and 0.74, respectively, a 4.23% improvement over the state-of-the-art approaches. In solution extraction, IS-Hunter achieves an average precision, recall, and F1-score of 0.83, 0.90, and 0.86, respectively, which is 4.88% higher than the best baseline methods.
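For readers less familiar with the metrics quoted in the abstract, the F1-score is the harmonic mean of precision and recall. The sketch below is illustrative only and is not taken from the paper: the function name `f1_score` is an assumption, and the inputs are simply the averaged precision and recall values reported above. Since the paper reports averages, an F1 recomputed from averaged precision and recall need not match an averaged F1 in general, although here the rounded values agree.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper, not from the paper)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was predicted or retrieved correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averaged precision/recall values quoted in the abstract.
issue_f1 = f1_score(0.74, 0.74)      # issue detection
solution_f1 = f1_score(0.83, 0.90)   # solution extraction

print(f"issue detection F1:     {issue_f1:.2f}")
print(f"solution extraction F1: {solution_f1:.2f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a method cannot reach a high F1-score by trading away precision for recall or vice versa.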
About the journal:
IEEE Transactions on Reliability is a refereed journal for the reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.