Anonymizing Location Information in Unstructured Text Using Knowledge Graph
Authors: Taisho Sasada, Yuzo Taenaka, Y. Kadobayashi
Published in: Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services, 2020-11-30
DOI: 10.1145/3428757.3429195 (https://doi.org/10.1145/3428757.3429195)
There is a growing need to anonymize data as new businesses increasingly utilize vast amounts of unstructured text. Unstructured text also carries the risk that an individual's location can be estimated from the location information it contains. Nevertheless, existing generalization methods do not take location information into account and therefore cannot robustly handle this attack. In this study, we propose anonymizing location information in unstructured text using a knowledge graph newly constructed from an actual geographic information system. Our method anonymizes text while taking actual geographic information into account, handles abbreviations and spelling inconsistencies, and allows for dynamic graph updates. The results of the evaluation experiments show that our anonymization is more robust than existing methods against location estimation attacks without compromising the usefulness of the dataset. We also found that the names of organizations and places with a high probability of occurrence in unstructured text are more likely to lead to personal identification.
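The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: location mentions are normalized through an alias table and then replaced by a broader region taken from a hierarchy. It is not the authors' method. The toy graph, the alias table, and the names PARENT, ALIASES, canonical, generalize, and anonymize are all hypothetical and chosen for illustration; a real system would build the graph from a geographic information system and would need proper entity recognition.

```python
import re

# Hypothetical toy knowledge graph: each place points to its broader parent region.
# (Assumption for illustration; a real graph would come from a GIS.)
PARENT = {
    "Nara Institute of Science and Technology": "Ikoma",
    "Ikoma": "Nara Prefecture",
    "Nara Prefecture": "Kansai",
    "Kansai": "Japan",
}

# Hypothetical alias table: abbreviations and spelling variants -> canonical node.
ALIASES = {
    "naist": "Nara Institute of Science and Technology",
    "ikoma city": "Ikoma",
    "ikoma-shi": "Ikoma",
}

# Case-insensitive lookup for canonical node names.
NODE_BY_LOWER = {n.lower(): n for n in set(PARENT) | set(PARENT.values())}


def canonical(mention):
    """Map a surface form to its canonical node, or None if unknown."""
    key = mention.strip().lower()
    if key in ALIASES:
        return ALIASES[key]
    return NODE_BY_LOWER.get(key)


def generalize(node, levels):
    """Climb up to `levels` steps toward the root, stopping at the root."""
    for _ in range(levels):
        if node not in PARENT:
            break
        node = PARENT[node]
    return node


# Match the longest surface forms first so "Ikoma-shi" wins over "Ikoma".
_SURFACE_FORMS = sorted(set(ALIASES) | set(NODE_BY_LOWER), key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(s) for s in _SURFACE_FORMS) + r")\b",
    re.IGNORECASE,
)


def anonymize(text, levels=1):
    """Replace every known location mention with a broader region."""
    def _swap(match):
        return generalize(canonical(match.group(0)), levels)
    # A single pass means replaced text is never generalized a second time.
    return _PATTERN.sub(_swap, text)


if __name__ == "__main__":
    print(anonymize("I study at NAIST and live in Ikoma-shi.", levels=2))
```

In this sketch the `levels` parameter controls the trade-off between privacy and usefulness: climbing further up the hierarchy hides more detail but leaves less information in the text. Because the graph is an ordinary dictionary here, adding or correcting an entry takes effect on the next call, which is a loose stand-in for the dynamic graph updates mentioned above.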