Automated Plagiarism Detection Model Based On Deep Siamese Network
Authors: Jing Zhang, Siyuan Xue, Jierui Li, Jian She
Published in: 2022 IEEE 8th International Conference on Cloud Computing and Intelligent Systems (CCIS), 2022-11-26
DOI: https://doi.org/10.1109/ccis57298.2022.10016354
This paper presents a novel deep Siamese network for automatic plagiarism detection. Our model uses the large-scale pre-trained model BERT (bidirectional encoder representations from transformers) to represent the text as word vectors, uses Bi-LSTM (bidirectional long short-term memory) networks to obtain the contextual semantic features of the text, and introduces a text semantic interaction mechanism to obtain interactive semantic features. The Siamese network maps matched text pairs uniformly into the same parameter matrix space. Meanwhile, multi-head self-attention fuses the text pair vectors for accurate semantic alignment and similarity measurement. The experimental results show that the model can identify and detect plagiarized text.
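
The central Siamese idea, encoding both texts of a pair with one set of shared parameters and then measuring similarity in that common space, can be illustrated with a minimal sketch. This is not the paper's implementation: the hashed token vectors and the single random projection are toy stand-ins (assumptions) for BERT and the Bi-LSTM, the interaction mechanism and multi-head self-attention fusion are omitted, and the names `SharedEncoder`, `cosine_similarity`, and `pair_similarity` are hypothetical.

```python
import hashlib
import math
import random


class SharedEncoder:
    """Toy stand-in for a BERT + Bi-LSTM encoder (assumption, not the paper's model)."""

    def __init__(self, in_dim=32, out_dim=16, seed=0):
        rng = random.Random(seed)
        self.in_dim = in_dim
        # One weight matrix reused for every input: this is the Siamese constraint.
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
                        for _ in range(out_dim)]

    def _token_vector(self, token):
        # Deterministic pseudo-embedding derived from a hash of the token.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        return [digest[i % len(digest)] / 255.0 - 0.5 for i in range(self.in_dim)]

    def encode(self, text):
        tokens = text.lower().split()
        if not tokens:
            return [0.0] * len(self.weights)
        # Mean-pool the token vectors, then project through the shared weights.
        pooled = [sum(column) / len(tokens)
                  for column in zip(*(self._token_vector(t) for t in tokens))]
        return [math.tanh(sum(w * x for w, x in zip(row, pooled)))
                for row in self.weights]


def cosine_similarity(u, v):
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for empty inputs
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def pair_similarity(encoder, text_a, text_b):
    # Both branches call the same encoder object, so the parameters are shared.
    return cosine_similarity(encoder.encode(text_a), encoder.encode(text_b))


encoder = SharedEncoder()
score = pair_similarity(encoder,
                        "the model detects plagiarized text",
                        "the model detects copied text")
print(round(score, 3))
```

Because both texts pass through the same `SharedEncoder` instance, the score is symmetric in its two arguments and identical texts score 1.0. A trained system would learn the shared weights from labeled text pairs and apply a decision threshold to the score; neither step is shown here.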