Automatic bi-modal question title generation for Stack Overflow with prompt learning

Shaoyu Yang, Xiang Chen, Ke Liu, Guang Yang, Chi Yu

Empirical Software Engineering (published 2024-05-03). DOI: 10.1007/s10664-024-10466-4
Abstract
When drafting question posts for Stack Overflow, developers may not accurately summarize the core problem in the question title, which can prevent these questions from receiving timely help. Therefore, improving the quality of question titles has attracted wide attention from researchers. An initial study aimed to generate titles automatically by analyzing only the code snippets in the question body. However, that study ignored the helpful information in the corresponding problem descriptions. Therefore, we propose SOTitle+, an approach that considers bi-modal information (i.e., the code snippets and the problem descriptions) in the question body. We formalize title generation for different programming languages as separate but related tasks and use multi-task learning to solve them. We then fine-tune the pre-trained language model CodeT5 to generate the titles automatically. Unfortunately, the inconsistent inputs and optimization objectives between the pre-training task and our investigated task may prevent fine-tuning from fully exploiting the knowledge of the pre-trained model. To address this issue, SOTitle+ further prompt-tunes CodeT5 with hybrid prompts (i.e., a mixture of hard and soft prompts). To verify the effectiveness of SOTitle+, we constructed a large-scale, high-quality corpus from recent data dumps shared by Stack Overflow. Our corpus includes 179,119 high-quality question posts covering six popular programming languages. Experimental results show that SOTitle+ significantly outperforms four state-of-the-art baselines in both automatic and human evaluation. In addition, our ablation studies confirm the effectiveness of SOTitle+'s component settings (such as bi-modal information, prompt learning, hybrid prompts, and multi-task learning). Our work indicates that considering bi-modal information and prompt learning for Stack Overflow title generation is a promising direction for exploration.
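To make the hybrid-prompt idea concrete, the following is a minimal, illustrative sketch (not the authors' released implementation) of how bi-modal input, a hard prompt, and learnable soft-prompt embeddings could be wired into CodeT5 using the Hugging Face transformers library. The hard-prompt template wording, the soft-prompt length, and the example inputs are assumptions made for illustration; the `Salesforce/codet5-base` checkpoint is a publicly available CodeT5 model.

```python
# Minimal sketch of hybrid prompting for title generation with CodeT5.
# NOT the SOTitle+ implementation: the template, soft-prompt length,
# and inputs below are illustrative assumptions.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

# Bi-modal input: the problem description and the code snippet.
description = "My comprehension returns a generator instead of a list."
code = "squares = (x * x for x in range(10))"

# Hard prompt: a hand-written natural-language template (assumed wording).
hard_prompt = (
    f"Generate a title for this question. "
    f"Description: {description} Code: {code}"
)
input_ids = tokenizer(hard_prompt, return_tensors="pt",
                      truncation=True).input_ids

# Soft prompt: trainable continuous embeddings prepended to the
# token embeddings (20 virtual tokens is an illustrative choice).
n_soft = 20
embed_dim = model.get_input_embeddings().embedding_dim
soft_prompt = torch.nn.Parameter(torch.randn(1, n_soft, embed_dim) * 0.02)

token_embeds = model.get_input_embeddings()(input_ids)
inputs_embeds = torch.cat([soft_prompt, token_embeds], dim=1)
attention_mask = torch.ones(inputs_embeds.shape[:2], dtype=torch.long)

# During prompt tuning, `soft_prompt` (and optionally the model weights)
# would be optimized against reference titles; here we only decode.
with torch.no_grad():
    output_ids = model.generate(inputs_embeds=inputs_embeds,
                                attention_mask=attention_mask,
                                max_length=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Without task-specific training, the base checkpoint will not produce meaningful titles; the sketch only shows how the hard prompt supplies a textual template around the bi-modal input while the soft prompt contributes trainable vectors that sidestep the mismatch between the pre-training objective and the title-generation task.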
About the journal
Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories.
The journal also offers industrial experience reports detailing the application of software technologies (processes, methods, or tools) and their effectiveness in industrial settings.
Empirical Software Engineering promotes the publication of industry-relevant research to address the significant gap between research and practice.