Automatic external Persian plagiarism detection using vector space model
P. Mahdavi, Zahra Siadati, F. Yaghmaee
2014 4th International Conference on Computer and Knowledge Engineering (ICCKE), 2014-12-22
DOI: 10.1109/ICCKE.2014.6993398 (https://doi.org/10.1109/ICCKE.2014.6993398)
Easy and widespread access to the Internet has made plagiarism and text reuse more common. Automatic plagiarism detection has been studied extensively, but few studies address Persian, due to the lack of a suitable Persian corpus. This paper proposes an external plagiarism detection method for Persian based on the vector space model (VSM). A Persian corpus was developed to implement and evaluate the method. Several optimizations made during the study improve the algorithm's speed and accuracy. In tests, the proposed method achieves an accuracy of 0.87 with a processing time of less than 10 minutes.
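The abstract does not describe how the vector space model is applied, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes whitespace tokenization, smoothed TF-IDF weighting, and cosine similarity for ranking candidate source documents; the function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `rank_sources`) and the toy Persian sentences are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization. A real Persian pipeline
    # would likely also normalize characters and remove stop words.
    return text.split()


def build_idf(documents):
    # Smoothed inverse document frequency computed over the source collection.
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    return idf, n


def tfidf_vector(text, idf, default_idf):
    # Sparse TF-IDF vector stored as a dict: term -> weight.
    tf = Counter(tokenize(text))
    return {term: freq * idf.get(term, default_idf) for term, freq in tf.items()}


def cosine(u, v):
    # Cosine similarity between two sparse vectors.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_sources(suspicious, sources):
    # Returns (similarity, source_index) pairs, most similar first.
    idf, n = build_idf(sources)
    default_idf = math.log(1 + n) + 1.0  # weight for terms unseen in the sources
    query = tfidf_vector(suspicious, idf, default_idf)
    scored = [
        (cosine(query, tfidf_vector(src, idf, default_idf)), index)
        for index, src in enumerate(sources)
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy data: three short source sentences and one suspicious text.
sources = [
    "کتاب روی میز است",
    "هوا امروز آفتابی است",
    "او به مدرسه رفت",
]
suspicious = "کتاب روی میز است"
ranking = rank_sources(suspicious, sources)
```

In an external detection setting, sources whose similarity exceeds some chosen threshold would be passed on for closer comparison. The threshold, like the weighting scheme above, is a design choice that the abstract does not specify.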