Comparison of parallel central processing unit‐ and graphics processing unit‐based implementations of greedy string tiling algorithm for source code plagiarism detection
M. Mišić, M. Tomasevic. Concurrency and Computation: Practice and Experience. Published 2022-06-15. DOI: 10.1002/cpe.7135 (https://doi.org/10.1002/cpe.7135)
Abstract
Massive-enrollment computing courses often include practical training through programming assignments and projects, which are frequent targets for plagiarism. Source code similarity detection tools are used to deter such misconduct. Parallel processing has become a viable technique for speeding up the processing of large workloads. This article examines the parallelization of a source code similarity detection method based on the greedy string tiling and Karp–Rabin algorithms. Both CPU and GPU parallelization approaches are discussed: the CPU implementation uses Pthreads, whereas the GPU implementation employs CUDA. On datasets consisting of real student assignment code, speedups of up to seven times over the sequential version are achieved, depending on the dataset. Evaluation results on both platforms are compared and discussed in detail.
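For readers unfamiliar with greedy string tiling, the following is a minimal sequential sketch of the basic algorithm on two token sequences. It is not the implementation evaluated in the article: it is written in Python rather than C with Pthreads or CUDA, it omits the Karp–Rabin hashing that accelerates match search, and the names `greedy_string_tiling`, `similarity`, and `min_match_len` are illustrative assumptions. The similarity formula (twice the tiled length divided by the combined sequence length) is one commonly used measure, assumed here for illustration.

```python
def greedy_string_tiling(a, b, min_match_len=3):
    """Basic greedy string tiling (no Karp-Rabin speedup).

    Returns a list of tiles (start_in_a, start_in_b, length); each token
    of a and b belongs to at most one tile.
    """
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiles = []
    while True:
        max_match = min_match_len
        matches = []
        # Phase 1: collect all maximal matches of unmarked tokens.
        for i in range(len(a)):
            if marked_a[i]:
                continue
            for j in range(len(b)):
                if marked_b[j]:
                    continue
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k == max_match:
                    matches.append((i, j, k))
                elif k > max_match:
                    matches = [(i, j, k)]
                    max_match = k
        if not matches:
            break
        # Phase 2: turn matches into tiles, skipping any match that
        # overlaps a tile placed earlier in this same round.
        for i, j, k in matches:
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            for off in range(k):
                marked_a[i + off] = True
                marked_b[j + off] = True
            tiles.append((i, j, k))
    return tiles


def similarity(a, b, tiles):
    """Fraction of tokens covered by tiles (assumed measure, in [0, 1])."""
    if not a and not b:
        return 0.0
    covered = sum(length for _, _, length in tiles)
    return 2.0 * covered / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical token streams; real tools tokenize source code first.
    a = list("abcdefgh")
    b = list("xxabcdyyefgh")
    tiles = greedy_string_tiling(a, b)
    print(tiles, similarity(a, b, tiles))
```

In this toy example, the two blocks `abcd` and `efgh` are found as separate tiles even though unrelated tokens were inserted between them in `b`, giving a similarity of 0.8. Because each pair of submissions is compared independently, the all-pairs workload is a natural candidate for distribution across CPU threads or GPU threads; how the article actually partitions the work is not stated in the abstract.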