J. Wilkie, Ziad Al Halabi, Alperen Karaoglu, Jiafeng Liao, George Ndungu, Chaiyong Ragkhitwetsagul, M. Paixão, J. Krinke
{"title":"这是谁?使用IDE事件数据识别开发人员","authors":"J. Wilkie, Ziad Al Halabi, Alperen Karaoglu, Jiafeng Liao, George Ndungu, Chaiyong Ragkhitwetsagul, M. Paixão, J. Krinke","doi":"10.1145/3196398.3196461","DOIUrl":null,"url":null,"abstract":"This paper presents a technique to identify a developer based on their IDE event data. We exploited the KaVE data set which recorded IDE activities from 85 developers with 11M events. We found that using an SVM with a linear kernel on raw event count outperformed k-NN in identifying developers with an accuracy of 0.52. Moreover, after setting the optimal number of events and sessions to train the classifier, we achieved a higher accuracy of 0.69 and 0.71 respectively. The findings shows that we can identify developers based on their IDE event data. The technique can be expanded further to group similar developers for IDE feature recommendations.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"13 1","pages":"90-93"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Who's This? Developer Identification Using IDE Event Data\",\"authors\":\"J. Wilkie, Ziad Al Halabi, Alperen Karaoglu, Jiafeng Liao, George Ndungu, Chaiyong Ragkhitwetsagul, M. Paixão, J. Krinke\",\"doi\":\"10.1145/3196398.3196461\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents a technique to identify a developer based on their IDE event data. We exploited the KaVE data set which recorded IDE activities from 85 developers with 11M events. We found that using an SVM with a linear kernel on raw event count outperformed k-NN in identifying developers with an accuracy of 0.52. Moreover, after setting the optimal number of events and sessions to train the classifier, we achieved a higher accuracy of 0.69 and 0.71 respectively. The findings shows that we can identify developers based on their IDE event data. The technique can be expanded further to group similar developers for IDE feature recommendations.\",\"PeriodicalId\":6639,\"journal\":{\"name\":\"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)\",\"volume\":\"13 1\",\"pages\":\"90-93\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-05-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3196398.3196461\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3196398.3196461","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Who's This? Developer Identification Using IDE Event Data
This paper presents a technique to identify a developer based on their IDE event data. We exploited the KaVE data set which recorded IDE activities from 85 developers with 11M events. We found that using an SVM with a linear kernel on raw event count outperformed k-NN in identifying developers with an accuracy of 0.52. Moreover, after setting the optimal number of events and sessions to train the classifier, we achieved a higher accuracy of 0.69 and 0.71 respectively. The findings shows that we can identify developers based on their IDE event data. The technique can be expanded further to group similar developers for IDE feature recommendations.