Yangrui Yang, Yaping Zhu, Sisi Chen, Pengpeng Jian
{"title":"API比较知识的快速语言模型提取","authors":"Yangrui Yang, Yaping Zhu, Sisi Chen, Pengpeng Jian","doi":"10.1016/j.cola.2023.101200","DOIUrl":null,"url":null,"abstract":"<div><p>Application Programming Interfaces (APIs) are frequent in software engineering domain texts, such as API references and Stack Overflow. These APIs and the comparison knowledge between them are not only important for solving programming issues (e.g., question answering), but they are also organized into structured knowledge to support many software engineering tasks (e.g., API misuse detection). As a result, extracting API comparison knowledge (API entities and semantic relations) from texts is essential. Existing rule-based and sequence labeling-based approaches must manually enumerate all linguistic patterns or label a large amount of data. Therefore, they involve a significant labor overhead and are exacerbated by morphological and common-word ambiguity. In contrast to matching or labeling API entities and relations, we formulates heterogeneous API extraction and API relation extraction tasks as a sequence-to-sequence generation task. It proposes APICKnow, an API entity-relation joint extraction model based on the large language model. To improve our model’s performance and quick learning ability, we adopt the prompt learning method to stimulate APICKnow to recognize API entities and relations. We systematically evaluate APICKnow on a set of sentences from Stack Overflow. The experimental results show that APICKnow can outperform the state-of-the-art baselines, and APICKnow has a quick learning ability and strong generalization ability.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"75 ","pages":"Article 101200"},"PeriodicalIF":1.7000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"API comparison knowledge extraction via prompt-tuned language model\",\"authors\":\"Yangrui Yang, Yaping Zhu, Sisi Chen, Pengpeng Jian\",\"doi\":\"10.1016/j.cola.2023.101200\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Application Programming Interfaces (APIs) are frequent in software engineering domain texts, such as API references and Stack Overflow. These APIs and the comparison knowledge between them are not only important for solving programming issues (e.g., question answering), but they are also organized into structured knowledge to support many software engineering tasks (e.g., API misuse detection). As a result, extracting API comparison knowledge (API entities and semantic relations) from texts is essential. Existing rule-based and sequence labeling-based approaches must manually enumerate all linguistic patterns or label a large amount of data. Therefore, they involve a significant labor overhead and are exacerbated by morphological and common-word ambiguity. In contrast to matching or labeling API entities and relations, we formulates heterogeneous API extraction and API relation extraction tasks as a sequence-to-sequence generation task. It proposes APICKnow, an API entity-relation joint extraction model based on the large language model. To improve our model’s performance and quick learning ability, we adopt the prompt learning method to stimulate APICKnow to recognize API entities and relations. We systematically evaluate APICKnow on a set of sentences from Stack Overflow. The experimental results show that APICKnow can outperform the state-of-the-art baselines, and APICKnow has a quick learning ability and strong generalization ability.</p></div>\",\"PeriodicalId\":48552,\"journal\":{\"name\":\"Journal of Computer Languages\",\"volume\":\"75 \",\"pages\":\"Article 101200\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2023-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Computer Languages\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2590118423000102\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Computer Languages","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2590118423000102","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
API comparison knowledge extraction via prompt-tuned language model
Application Programming Interfaces (APIs) are frequent in software engineering domain texts, such as API references and Stack Overflow. These APIs and the comparison knowledge between them are not only important for solving programming issues (e.g., question answering), but they are also organized into structured knowledge to support many software engineering tasks (e.g., API misuse detection). As a result, extracting API comparison knowledge (API entities and semantic relations) from texts is essential. Existing rule-based and sequence labeling-based approaches must manually enumerate all linguistic patterns or label a large amount of data. Therefore, they involve a significant labor overhead and are exacerbated by morphological and common-word ambiguity. In contrast to matching or labeling API entities and relations, we formulates heterogeneous API extraction and API relation extraction tasks as a sequence-to-sequence generation task. It proposes APICKnow, an API entity-relation joint extraction model based on the large language model. To improve our model’s performance and quick learning ability, we adopt the prompt learning method to stimulate APICKnow to recognize API entities and relations. We systematically evaluate APICKnow on a set of sentences from Stack Overflow. The experimental results show that APICKnow can outperform the state-of-the-art baselines, and APICKnow has a quick learning ability and strong generalization ability.