Content-Based Geospatial Schema Matching Using Semi-supervised Geosemantic Clustering and Hierarchy
Authors: J. Partyka, L. Khan
Published in: 2011 IEEE Fifth International Conference on Semantic Computing (2011-09-18)
DOI: 10.1109/ICSC.2011.18 (https://doi.org/10.1109/ICSC.2011.18)
The problem of semantic similarity across heterogeneous geospatial data sources continues to attract interest. Determining semantic similarity across data sources typically involves 1:1 matching of attributes and their instances between tables. When clustering methods are used for this matching, three distinct challenges remain unaddressed. First, many clustering algorithms rely on only one instance property. Second, they do not produce a consistent score for an attribute match. Finally, they do not consider hierarchical relationships between the data. To address these challenges, we introduce GeoSim, a tool for determining the semantic similarity between geospatial schemas. GeoSim consists of two components, GeoSimG and GeoSimH. GeoSimG derives clusters from attribute instances based on their geographic and semantic properties. It examines attribute instances in the clusters to calculate a consistent semantic similarity score through entropy-based distribution (EBD). GeoSimH also captures hierarchical relationships between compared tables and attributes. Results from experiments involving multi-jurisdictional geospatial datasets show that GeoSim outperforms several popular semantic similarity approaches.
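The abstract does not give the EBD formula, so the following is only a minimal sketch of how an entropy-based score over clusters might be computed. It assumes EBD is the ratio of the conditional entropy of the cluster given the source attribute, H(C|T), to the entropy of the cluster, H(C). The function names (`entropy`, `conditional_entropy`, `ebd_score`), the input encoding, and the single-cluster convention are all hypothetical and are not taken from GeoSim.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy(clusters, attributes):
    """H(C|T): entropy of cluster labels within each attribute, weighted by attribute frequency."""
    n = len(clusters)
    by_attribute = {}
    for cluster, attribute in zip(clusters, attributes):
        by_attribute.setdefault(attribute, []).append(cluster)
    return sum(len(group) / n * entropy(group) for group in by_attribute.values())


def ebd_score(clusters, attributes):
    """Assumed EBD definition: H(C|T) / H(C).

    clusters[i]   -- cluster id assigned to instance i
    attributes[i] -- which compared attribute instance i came from
    """
    h_c = entropy(clusters)
    if h_c == 0.0:
        # Assumed convention (not from the paper): a single shared cluster
        # is treated as fully mixed.
        return 1.0
    return conditional_entropy(clusters, attributes) / h_c


# Instances of attributes "A" and "B" spread evenly over both clusters.
mixed = ebd_score([0, 0, 1, 1], ["A", "B", "A", "B"])
# Each attribute's instances fall into their own cluster.
separated = ebd_score([0, 0, 1, 1], ["A", "A", "B", "B"])
```

Under this assumed definition, a score near 1 means that knowing the source attribute tells you little about which cluster an instance lands in, which suggests the two attributes hold similar content. A score near 0 means the attributes occupy separate clusters.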