M. Golzadeh, Alexandre Decan, Natarajan Chidambaram
On the Accuracy of Bot Detection Techniques
2022 IEEE/ACM 4th International Workshop on Bots in Software Engineering (BotSE), 2022-05-01
DOI: 10.1145/3528228.3528406 (https://doi.org/10.1145/3528228.3528406)
Development bots are often used to automate a wide variety of repetitive tasks in collaborative software development. Such bots are commonly among the most active project contributors in terms of commit activity. As such, tools that analyse contributor activity (e.g., for recognizing and giving credit to project members for their contributions) need to take into account the bots and exclude their activity. While there are a few techniques to detect bots in software repositories, these techniques are not perfect and may miss some bots or may wrongly identify some human accounts as bots. In this paper, we present an exploratory study on the accuracy of bot detection techniques on a set of 540 accounts from 27 GitHub projects. We show that none of the bot detection techniques are accurate enough to detect bots among the 20 most active contributors of each project. We show that combining these techniques drastically increases the accuracy and recall of bot detection. We also highlight the importance of considering bots when attributing contributions to humans, since bots are prevalent among the top contributors and responsible for large proportions of commits.
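The abstract does not say how the detection techniques are combined, so the following is only a minimal sketch of one plausible rule: flag an account as a bot if any detector flags it. The detector names, account names, and labels are hypothetical toy data, not results from the study, and `combine_any` and `precision_recall` are illustrative helpers rather than part of any published tool.

```python
from typing import Dict, Set, Tuple


def combine_any(predictions: Dict[str, Set[str]]) -> Set[str]:
    """Union rule (an assumption): an account is a bot if any detector flags it."""
    combined: Set[str] = set()
    for flagged in predictions.values():
        combined |= flagged
    return combined


def precision_recall(predicted: Set[str], actual: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted bot accounts against the true set."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical ground truth and detector outputs, for illustration only.
true_bots = {"deps-updater", "ci-runner", "release-helper"}
predictions = {
    "detector_a": {"deps-updater"},        # misses two bots
    "detector_b": {"ci-runner", "alice"},  # misses two bots, flags one human
}

if __name__ == "__main__":
    for name, flagged in predictions.items():
        p, r = precision_recall(flagged, true_bots)
        print(f"{name}: precision={p:.2f} recall={r:.2f}")
    p, r = precision_recall(combine_any(predictions), true_bots)
    print(f"combined: precision={p:.2f} recall={r:.2f}")
```

Under a union rule, recall can only stay the same or rise, because every bot found by any single detector is kept, while precision may fall, because every detector's false positives are kept as well. A stricter rule, such as requiring agreement between detectors, would trade in the other direction.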