{"title":"Identifying text reuse using word net-based extended named entity recognition","authors":"Eunji Lee, Pankoo Kim","doi":"10.1145/3264746.3264811","journal":{"name":"Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems"},"publicationDate":"2018-10-09","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1145/3264746.3264811"}
Identifying text reuse using word net-based extended named entity recognition
Text reuse is an unethical practice that has become more prevalent as the internet and smartphones have driven the digitization of content. It is difficult to detect when the word order changes or when words are inserted, deleted, or replaced. In particular, a word that is replaced with a word of similar meaning is excluded from similarity measurement. To address this, this paper proposes a similarity measure in which named entity recognition is performed on the words in the target document and each word is annotated with a named entity tag. However, typical named entity recognition targets only proper nouns, so common nouns that are replaced with similar words are not classified into the same named entity class. To resolve this problem, we use WordNet to extend the range of named entity recognition.
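
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `TOY_CLASSES` table is a hand-written, hypothetical stand-in for a WordNet lookup, the class names are invented for illustration, and Jaccard overlap is an assumed similarity measure, since the abstract does not say which one the paper uses.

```python
# Hypothetical stand-in for a WordNet-based class lookup.
# A real system would derive these classes from WordNet; this table is
# hand-written purely for illustration.
TOY_CLASSES = {
    "car": "VEHICLE",
    "automobile": "VEHICLE",
    "doctor": "PERSON",
    "physician": "PERSON",
}


def tag(tokens):
    """Replace each token that has a known class with its class tag.

    Tokens without a class are kept in lowercase form.
    """
    return [TOY_CLASSES.get(t.lower(), t.lower()) for t in tokens]


def jaccard(a, b):
    """Order-insensitive Jaccard similarity of two token lists."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def reuse_similarity(text_a, text_b):
    """Compare two texts after mapping words to their class tags."""
    return jaccard(tag(text_a.split()), tag(text_b.split()))


if __name__ == "__main__":
    original = "the doctor bought a car"
    reused = "a physician bought the automobile"
    # Plain word overlap misses the replaced common nouns.
    print(jaccard(original.split(), reused.split()))
    # Tag-based comparison treats them as members of the same class.
    print(reuse_similarity(original, reused))
```

Because the comparison works on sets, reordering words does not change the score, and replacing "doctor" with "physician" no longer lowers it once both map to the same tag. The trade-off of any class-based scheme like this is that distinct words in the same class also become indistinguishable.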