Detecting Duplicates in Geoinformatics: from Intervals and Fuzzy Numbers to General Multi-D Uncertainty
S. Starks, L. Longpré, R. Araiza, V. Kreinovich, Hung T. Nguyen
NAFIPS 2007 - 2007 Annual Meeting of the North American Fuzzy Information Processing Society, June 24, 2007. DOI: 10.1109/NAFIPS.2007.383900
Geospatial databases generally consist of measurements related to points (or pixels in the case of raster data), lines, and polygons. In recent years, the size and complexity of these databases have increased significantly, and they often contain duplicate records, i.e., two or more close records representing the same measurement result. In this paper, we address the problem of detecting duplicates in a database consisting of point measurements. As a test case, we use a database of measurements of anomalies in the Earth's gravity field that we have compiled. In our previous papers (2003, 2004), we proposed a new fast (O(n · log(n))) duplicate deletion algorithm for the case when closeness of two points (x1, y1) and (x2, y2) is described as closeness of both coordinates. In this paper, we extend this algorithm to the case when closeness is described by an arbitrary metric. Both algorithms have been successfully applied to gravity databases.
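The coordinate-wise notion of closeness described in the abstract (|x1 - x2| ≤ ε and |y1 - y2| ≤ ε) admits a sort-based search that avoids comparing all O(n²) pairs. The sketch below is an illustration of that general idea, not the authors' published algorithm: it sorts points by x and, for each point, compares it only against earlier points inside a sliding window of width ε in x. The function name, the ε parameter, and the tuple representation of points are all assumptions made for this example.

```python
def find_duplicates(points, eps):
    """Report candidate duplicate pairs among 2-D point measurements.

    Two points are considered duplicates when BOTH coordinates are
    within eps of each other (the coordinate-wise closeness from the
    abstract).  Sorting costs O(n log n); the inner loop only scans
    points whose x-coordinate lies within eps, so on data where
    duplicates are sparse the total work stays near O(n log n).
    """
    pts = sorted(points)          # sort by x (then y) -- O(n log n)
    dups = []
    start = 0                     # left edge of the sliding x-window
    for i, (xi, yi) in enumerate(pts):
        # advance the window so that pts[start..i-1] all have x >= xi - eps
        while pts[start][0] < xi - eps:
            start += 1
        for j in range(start, i):
            xj, yj = pts[j]
            if abs(yj - yi) <= eps:   # x is already close; check y
                dups.append(((xj, yj), (xi, yi)))
    return dups
```

Extending this to an arbitrary metric, as the paper does, would replace the per-coordinate test with a metric evaluation while keeping some spatial ordering to prune comparisons; the window trick above only works directly when the metric's balls are bounded in x.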