{"title":"基于用户嵌入的OpenStreetMap故意破坏检测","authors":"Yinxiao Li, T. J. Anderson, Yiqi Niu","doi":"10.1145/3459637.3482213","DOIUrl":null,"url":null,"abstract":"OpenStreetMap (OSM) is a free and openly-editable database of geographic information. Over the years, OSM has evolved into the world's largest open knowledge base of geospatial data, and protecting OSM from the risk of vandalized and falsified information has become paramount to ensuring its continued success. However, despite the increasing usage of OSM and a wide interest in vandalism detection on open knowledge bases such as Wikipedia and Wikidata, OSM has not attracted as much attention from the research community, partially due to a lack of publicly available vandalism corpus. In this paper, we report on the construction of the first OSM vandalism corpus, and release it publicly. We describe a user embedding approach to create OSM user embeddings and add embedding features to a machine learning model to improve vandalism detection in OSM. We validate the model against our vandalism corpus, and observe solid improvements in key metrics. The validated model is deployed into production for vandalism detection on Daylight Map.","PeriodicalId":405296,"journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management","volume":"117 11","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Vandalism Detection in OpenStreetMap via User Embeddings\",\"authors\":\"Yinxiao Li, T. J. Anderson, Yiqi Niu\",\"doi\":\"10.1145/3459637.3482213\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"OpenStreetMap (OSM) is a free and openly-editable database of geographic information. Over the years, OSM has evolved into the world's largest open knowledge base of geospatial data, and protecting OSM from the risk of vandalized and falsified information has become paramount to ensuring its continued success. However, despite the increasing usage of OSM and a wide interest in vandalism detection on open knowledge bases such as Wikipedia and Wikidata, OSM has not attracted as much attention from the research community, partially due to a lack of publicly available vandalism corpus. In this paper, we report on the construction of the first OSM vandalism corpus, and release it publicly. We describe a user embedding approach to create OSM user embeddings and add embedding features to a machine learning model to improve vandalism detection in OSM. We validate the model against our vandalism corpus, and observe solid improvements in key metrics. The validated model is deployed into production for vandalism detection on Daylight Map.\",\"PeriodicalId\":405296,\"journal\":{\"name\":\"Proceedings of the 30th ACM International Conference on Information & Knowledge Management\",\"volume\":\"117 11\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 30th ACM International Conference on Information & Knowledge Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3459637.3482213\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3459637.3482213","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Vandalism Detection in OpenStreetMap via User Embeddings
OpenStreetMap (OSM) is a free and openly-editable database of geographic information. Over the years, OSM has evolved into the world's largest open knowledge base of geospatial data, and protecting OSM from the risk of vandalized and falsified information has become paramount to ensuring its continued success. However, despite the increasing usage of OSM and a wide interest in vandalism detection on open knowledge bases such as Wikipedia and Wikidata, OSM has not attracted as much attention from the research community, partially due to a lack of publicly available vandalism corpus. In this paper, we report on the construction of the first OSM vandalism corpus, and release it publicly. We describe a user embedding approach to create OSM user embeddings and add embedding features to a machine learning model to improve vandalism detection in OSM. We validate the model against our vandalism corpus, and observe solid improvements in key metrics. The validated model is deployed into production for vandalism detection on Daylight Map.