Published in: 4th International Conference on Communication Engineering and Computer Science (CIC-COCOS’2022). DOI: 10.24086/cocos2022/paper.732 (https://doi.org/10.24086/cocos2022/paper.732)
Performance Evaluation of Source Code Plagiarism Detection System
Plagiarism detection systems are particularly useful in the educational sector, where scientific publications and articles are common. Plagiarism occurs when someone replicates a piece of work without permission from, or citation of, the original creator. Advances in information and communication technologies (ICT) and the easy availability of scientific material on the internet have made plagiarism detection a top priority, and the broad availability of freeware text editors has made source code plagiarism especially difficult to detect. Several studies have already examined the plagiarism detection algorithms used in identification systems, including those for source code. This work proposes a strategy that combines TF-IDF transformations with a Random Forest classifier and achieves an accuracy of 93.5 percent, which is high compared with previous strategies. The proposed system is implemented in Python.
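The abstract does not describe the feature pipeline in detail, so the following is only a minimal sketch of the TF-IDF step, written with the Python standard library. It assumes that submissions are split into simple lexical tokens, weighted with a smoothed TF-IDF, and compared by cosine similarity. The helper names (`tokenize`, `tfidf_vectors`, `cosine`) and the three sample submissions are hypothetical, and the Random Forest stage is not reproduced here. In a fuller pipeline, weights or pairwise scores like these could serve as input features to a classifier such as scikit-learn's `RandomForestClassifier`.

```python
import math
import re
from collections import Counter

# Assumption: a very simple lexer; identifiers, integers, and single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_0-9]")


def tokenize(source):
    """Split source code into identifiers, numbers, and punctuation tokens."""
    return TOKEN_RE.findall(source)


def tfidf_vectors(documents):
    """Return one {token: weight} dict per document, using a smoothed IDF."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)

    # Document frequency: in how many submissions each token appears.
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty submissions
        vectors.append({
            tok: (cnt / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, cnt in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {token: weight} vectors."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical submissions: the second is the first with renamed variables.
submissions = [
    "def add(a, b):\n    return a + b\n",
    "def add(x, y):\n    return x + y\n",
    "for i in range(10):\n    print(i * i)\n",
]

vecs = tfidf_vectors(submissions)
sim_renamed = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
print(f"renamed copy: {sim_renamed:.3f}, unrelated code: {sim_unrelated:.3f}")
```

In this toy example the renamed copy scores higher than the unrelated snippet because keywords, the function name, and punctuation are shared. Token-level TF-IDF alone is sensitive to identifier renaming, which is one reason a learned classifier on top of such features can help.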