A Token-based Illicit Copy Detection Method Using Complexity for a Program Exercise
Mai Iwamoto, S. Oshima, T. Nakashima
2013 Eighth International Conference on Broadband and Wireless Computing, Communication and Applications, 2013-10-28
DOI: 10.1109/BWCCA.2013.100 (https://doi.org/10.1109/BWCCA.2013.100)
Citation count: 5
Abstract
Copying another person's source code and submitting it as one's own report is a recognized problem in programming exercises at universities and colleges, and these institutions need an algorithm that detects illicit copies automatically. Previously proposed methods use token length as the detection criterion, applying a threshold based simply on character length. Such methods produce misdetections for simple programs, such as a sequence of print statements, and when a matched token sequence begins or ends in the middle of a statement. This paper proposes a detection method that uses program complexity and complete token sequences. In experiments on exercise programs submitted by students, adopting complexity as the detection criterion improved the recall R, and adopting complete token sequences improved the precision P.
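The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how complexity is measured or how a complete token sequence is delimited, so everything below is an assumption made for illustration: complexity is approximated by counting branching tokens (a cyclomatic-style count), a complete token sequence is taken to be a whole C-like statement, and the names `tokenize`, `statements`, `complexity`, and `suspicious_blocks` are hypothetical. It is not the authors' algorithm.

```python
import re
from difflib import SequenceMatcher

# Assumed tokenizer for C-like exercise programs (not taken from the paper).
TOKEN_RE = re.compile(r'[A-Za-z_]\w*|\d+|"(?:\\.|[^"\\])*"|&&|\|\||[^\sA-Za-z_0-9]')
KEYWORDS = {"int", "char", "float", "double", "void", "return", "break",
            "if", "else", "for", "while", "do", "switch", "case"}
# Assumed complexity proxy: tokens that introduce a decision point.
BRANCH = {"if", "for", "while", "case", "&&", "||"}

def tokenize(src):
    """Normalise identifiers, numbers, and strings so renaming does not hide a copy."""
    out = []
    for tok in TOKEN_RE.findall(src):
        if tok[0] == '"':
            out.append("STR")
        elif tok.isdigit():
            out.append("NUM")
        elif (tok[0].isalpha() or tok[0] == "_") and tok not in KEYWORDS:
            out.append("ID")
        else:
            out.append(tok)
    return out

def statements(tokens):
    """Group tokens into complete statements, split at ';', '{', '}' outside parentheses."""
    result, current, depth = [], [], 0
    for tok in tokens:
        current.append(tok)
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth = max(0, depth - 1)
        elif tok in {";", "{", "}"} and depth == 0:
            result.append(tuple(current))
            current = []
    if current:
        result.append(tuple(current))
    return result

def complexity(stmts):
    """Count decision points in a run of statements."""
    return sum(1 for stmt in stmts for tok in stmt if tok in BRANCH)

def suspicious_blocks(src_a, src_b, min_complexity=2):
    """Return (start_a, start_b, n_statements) for matched runs that are complex enough."""
    a, b = statements(tokenize(src_a)), statements(tokenize(src_b))
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    hits = []
    for m in matcher.get_matching_blocks():
        if m.size and complexity(a[m.a:m.a + m.size]) >= min_complexity:
            hits.append((m.a, m.b, m.size))
    return hits
```

Because matching is done on whole statements, a match can never start or end in the middle of a statement. Because the threshold is applied to the branching count instead of the character length, a long run of identical print statements is not reported on its length alone. The threshold `min_complexity=2` is an arbitrary illustrative value.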