Extracting Meaningful Issue–Solution Pair From Collaborative Developer Live Chats

IF 5.7 | Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Reliability Pub Date : 2025-04-01 DOI:10.1109/TR.2025.3550412

Jiawen Shen;Shikai Guo;Longfeng Chen;Chen Wu;Hui Li;Chenchen Li

IEEE Transactions on Reliability, vol. 74, no. 3, pp. 3600-3614. Full text: https://ieeexplore.ieee.org/document/10945766/

Citations: 0

Abstract

Developer live chats often contain useful information in the form of issue–solution pairs. These pairs give developers who face similar issues a reference for resolving them, which can improve software development efficiency. However, previous approaches such as ISPY still extract these pairs with unsatisfactory accuracy because the feature information of issue–solution pairs is entangled and complex. To address these challenges, we propose IS-Hunter, an approach for mining issue–solution pairs from real-time chat data. IS-Hunter consists of four main components: the data preprocessing component disentangles and denoises raw chat logs; the utterance embedding component embeds utterances into vectors that subsequent components can easily process; the feature extraction component obtains textual, heuristic, and contextual features that determine whether an utterance is topic-relevant; and the issue–solution pair prediction component predicts whether an utterance is an issue or a solution. Experimental results show that IS-Hunter outperforms the baseline methods in both issue detection and solution extraction in terms of precision, recall, and F1-score. In issue detection, IS-Hunter achieves an average precision, recall, and F1-score of 0.74, 0.74, and 0.74, respectively, a 4.23% improvement over the state-of-the-art approaches. In solution extraction, it achieves an average precision, recall, and F1-score of 0.83, 0.90, and 0.86, respectively, which is 4.88% higher than the best baseline methods.
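As a quick consistency check on the reported metrics, the sketch below recomputes F1 as the harmonic mean of the averaged precision and recall quoted in the abstract. The function name `f1_score` and the `reported` dictionary are illustrative and not taken from the paper. An average of per-project F1 scores need not equal the F1 of averaged precision and recall, so agreement here is only a rough plausibility check.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Average (precision, recall, F1) values as quoted in the abstract.
reported = {
    "issue detection": (0.74, 0.74, 0.74),
    "solution extraction": (0.83, 0.90, 0.86),
}

for task, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{task}: F1 from P/R = {f1:.2f}, reported = {f1_reported:.2f}")
```

For both tasks, the recomputed value rounds to the reported F1-score (0.74 and 0.86).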




Source journal

IEEE Transactions on Reliability (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

12.20

Self-citation rate

8.50%

Articles published

153

Review time

7.5 months

About the journal: IEEE Transactions on Reliability is a refereed journal for reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.