Detecting Abnormal Semantic Web Data Using Semantic Dependency
Yang Yu, Yingjie Li, J. Heflin
2011 IEEE Fifth International Conference on Semantic Computing, 2011-09-18
DOI: 10.1109/ICSC.2011.81 (https://doi.org/10.1109/ICSC.2011.81)
Data quality is a critical problem for the Semantic Web. We propose that the degree to which a triple deviates from similar triples can be an important heuristic for identifying errors. Inspired by data dependency, which has shown promise in database data quality research, we introduce Semantic Dependency to assess the quality of Semantic Web data. The system first builds a summary graph for finding candidate semantic dependencies. Each semantic dependency is assigned a probability according to its instantiations, and this probability is subsequently adjusted based on the inconsistencies among them. Each triple then receives a posterior probability of normality based on which semantic dependencies support it. By repeating this iteration, the proposed approach detects abnormal Semantic Web data. Experiments show that the system is efficient on a data set with 10M triples and improves the F-score by more than ten percent over our previous system.
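
The idea of scoring a dependency by its instantiations and flagging triples that deviate from similar ones can be illustrated with a small sketch. This is not the paper's algorithm: it omits the summary graph, the inconsistency adjustment, and the iteration, and it assumes a much simpler form of dependency ("for the same subject, the value of property p determines the value of property q"). The function name `dependency_support`, the majority-vote scoring, and the sample triples are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def dependency_support(triples, p, q):
    """Score the assumed dependency 'value of p determines value of q'.

    Returns (probability, suspects), where probability is the fraction of
    instantiations that agree with the majority q-value for their p-value,
    and suspects are the q-triples that deviate from that majority.
    Simplifying assumption: each subject has at most one value per property.
    """
    p_val = {s: o for s, pr, o in triples if pr == p}
    q_val = {s: o for s, pr, o in triples if pr == q}

    # Group subjects by their p-value and count the q-values seen in each group.
    groups = defaultdict(Counter)
    for s, pv in p_val.items():
        if s in q_val:
            groups[pv][q_val[s]] += 1

    total = sum(sum(c.values()) for c in groups.values())
    if total == 0:
        return 0.0, []

    # The majority q-value per group is treated as the "normal" value.
    majority = {pv: c.most_common(1)[0][0] for pv, c in groups.items()}
    agree = sum(c[majority[pv]] for pv, c in groups.items())

    suspects = [
        (s, q, q_val[s])
        for s, pv in p_val.items()
        if s in q_val and q_val[s] != majority[pv]
    ]
    return agree / total, suspects


# Hypothetical sample data: three subjects share a city, one disagrees on country.
triples = [
    ("a", "city", "Paris"), ("a", "country", "France"),
    ("b", "city", "Paris"), ("b", "country", "France"),
    ("c", "city", "Paris"), ("c", "country", "Spain"),
    ("d", "city", "Rome"), ("d", "country", "Italy"),
]
probability, suspects = dependency_support(triples, "city", "country")
```

On this sample, three of the four instantiations agree with their group's majority, so the dependency scores 0.75 and the triple `("c", "country", "Spain")` is the only one flagged. A fuller treatment in the spirit of the abstract would combine several such dependencies into a per-triple posterior and repeat the estimation.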