Can we predict useful comments in source codes? - Analysis of findings from Information Retrieval in Software Engineering Track @ FIRE 2022

Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation. Pub Date: 2022-12-09. DOI: 10.1145/3574318.3574329

Srijoni Majumdar, Ayan Bandyopadhyay, P. Das, Paul D. Clough, S. Chattopadhyay, Prasenjit Majumder


Citations: 4

Abstract

The Information Retrieval in Software Engineering (IRSE) track aims to develop solutions for the automated evaluation of code comments in a machine learning framework. The track poses a binary classification task: classify each comment as useful or not useful. The dataset consists of 9048 pairs of code comments and their surrounding code snippets, extracted from open-source, C-based GitHub projects. In total, 11 teams from various universities and software companies submitted 34 experiments. The submissions were evaluated quantitatively using the F1-score and qualitatively based on the type of features developed, the supervised learning model used, and the corresponding hyperparameters. The best-performing submissions mostly employed transformer architectures coupled with a software-development-related embedding space.
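As a rough illustration of the quantitative evaluation mentioned in the abstract, the sketch below computes an F1-score for a binary useful / not useful labelling. The label strings, the toy data, and the choice of "useful" as the positive class are assumptions made for this example; the abstract does not say whether the track reported a per-class or an averaged F1.

```python
def f1_score(gold, predicted, positive="useful"):
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives, and false negatives
    # with respect to the chosen positive label.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and classifier outputs for five comments.
gold = ["useful", "useful", "not useful", "useful", "not useful"]
predicted = ["useful", "not useful", "not useful", "useful", "useful"]

score = f1_score(gold, predicted)
print(f"F1 (positive class = 'useful'): {score:.3f}")
```

In this toy case the classifier makes two correct "useful" predictions, one false alarm, and one miss, so precision and recall are both 2/3 and so is the F1-score.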

