Srijoni Majumdar, Ayan Bandyopadhyay, P. Das, Paul D. Clough, S. Chattopadhyay, Prasenjit Majumder
Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation, 2022-12-09. DOI: https://doi.org/10.1145/3574318.3574329
Can we predict useful comments in source codes? - Analysis of findings from Information Retrieval in Software Engineering Track @ FIRE 2022
The Information Retrieval in Software Engineering (IRSE) track aims to develop solutions for the automated evaluation of code comments within a machine learning framework. The track poses a binary classification task: label each comment as useful or not useful. The dataset consists of 9048 pairs of code comments and their surrounding code snippets, extracted from open-source, C-based GitHub projects. In total, 11 teams from various universities and software companies submitted 34 experiments. The submissions were evaluated quantitatively using the F1-score and qualitatively based on the type of features developed, the supervised learning model used, and its corresponding hyper-parameters. The best-performing submissions mostly employed transformer architectures coupled with an embedding space related to software development.
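As a rough illustration of the quantitative measure named above, the following sketch computes an F1-score for a binary useful / not useful labelling. It is not the track's evaluation script: the label strings, the function name `f1_score`, the choice of "useful" as the positive class, and the toy data are all assumptions made for this example, and the abstract does not say how the track averaged scores across classes.

```python
def f1_score(gold, predicted, positive="useful"):
    """Return the F1-score of `predicted` against `gold` for one positive class.

    Assumption: labels are plain strings and `positive` names the class of
    interest; the actual label encoding used in the track is not given here.
    """
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold = ["useful", "useful", "not useful", "not useful", "useful"]
pred = ["useful", "not useful", "not useful", "useful", "useful"]
score = f1_score(gold, pred)
print(f"F1 = {score:.3f}")
```

Because F1 combines precision and recall, it penalises a classifier that labels every comment as useful more than plain accuracy would when the two classes are unevenly represented.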