Identity resolution: 23 years of practical experience and observations at scale
Jeff Jonas
Proceedings of the 2006 ACM SIGMOD international conference on Management of data, 2006-06-27
https://doi.org/10.1145/1142473.1142556
Identity resolution is a semantic reconciliation activity applied to people and organizations. It is most frequently quantified in terms of accuracy (false positives and false negatives); however, there are additional metrics by which to evaluate identity resolution algorithms, including methodology, persistence, streaming versus batch, data survivorship, operationalizing historical data, transaction/window size, ingestion speed, end-to-end latency, sequence neutrality, handling of ambiguous conditions, reconcilability, scalability, sustainability, and operational characteristics at scale. In addition, a technique for "analytics in the anonymized data space" will be presented that makes it possible to resolve identities in a more privacy-preserving manner.