Distance indices for the detection of similarity in C programs
J. Baby, T. Kannan, P. Vinod, V. Gopal
2014 International Conference on Computation of Power, Energy, Information and Communication (ICCPEIC), 2014-04-16
DOI: 10.1109/ICCPEIC.2014.6915408
Citations: 5
Abstract
Plagiarized articles and source code have proliferated among students and the research community. This paper focuses on an efficient method for differentiating between plagiarized and non-plagiarized programs. Similarity/distance measurement techniques are used to classify the test file. Thirty-six distance metrics are used to determine intra-class and inter-class proximity. Unseen files not used for frequency extraction are predicted with higher accuracy. This indicates that the proposed model, using intra/inter-family thresholds, can be implemented to identify plagiarized programs with a better detection rate.
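
The abstract does not say which features or which of the thirty-six metrics are used, so the following is only a minimal sketch of the general idea: extract token frequencies from C source, measure the distance between two frequency profiles, and compare it against a threshold. The tokenizer, the keyword list, the choice of Manhattan distance, and the single `threshold` parameter are all assumptions made for illustration, not the authors' method.

```python
import re
from collections import Counter

# Hypothetical, deliberately small keyword list; a real tool would cover all C keywords.
C_KEYWORDS = {"int", "char", "float", "double", "void", "return",
              "if", "else", "for", "while", "do", "struct"}

# Identifiers/keywords, integer literals, or any single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def token_frequencies(source):
    """Return the relative frequency of each token in a C source string.

    Non-keyword identifiers are mapped to the placeholder "ID" so that
    renaming variables does not change the profile (an assumption).
    """
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if (tok[0].isalpha() or tok[0] == "_") and tok not in C_KEYWORDS:
            tok = "ID"
        tokens.append(tok)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {tok: n / total for tok, n in Counter(tokens).items()}


def manhattan_distance(p, q):
    """Sum of absolute differences between two frequency profiles."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def is_plagiarized(test_src, reference_srcs, threshold):
    """Flag the test file if its nearest reference is within the threshold."""
    test = token_frequencies(test_src)
    nearest = min(manhattan_distance(test, token_frequencies(ref))
                  for ref in reference_srcs)
    return nearest <= threshold


original = "int main() { int a = 1; return a; }"
renamed = "int main() { int b = 1; return b; }"
different = "void f(char *s) { while (*s) s++; }"

print(is_plagiarized(renamed, [original], threshold=0.1))
print(is_plagiarized(different, [original], threshold=0.1))
```

In this sketch the threshold is a hand-picked constant. The abstract instead describes thresholds derived from intra-class and inter-class proximity, which would have to be estimated from labelled sets of plagiarized and non-plagiarized programs.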