Mining robust neighborhoods for quality control of sensor data

D. Galarus, R. Angryk
International Workshop on GeoStreaming, published 2013-11-05
DOI: 10.1145/2534303.2534309 (https://doi.org/10.1145/2534303.2534309)
Citations: 10

Abstract

Neighborhoods, as used for spatial and spatial-temporal data mining, define areas of similarity in data. Unless they are defined to account for outliers, missing data and spatial-temporal variation, the robustness of methods that rely on them will suffer. The focus of this paper is to demonstrate that neighborhoods can be defined and used in a robust manner that is resistant to such challenges. Our approach employs robust methods both in neighborhood construction and in the application of neighborhoods to estimate observations. These methods were tested with a large weather sensor data set from the National Weather Service that includes quality control indicators. Results were compared to a popular method used in the weather community, evaluated by root-mean-squared error and grouped by quality control indicator. Our first published results show that our methods are robust in the presence of outliers, missing data and spatial-temporal variation. They yield results consistent with the quality control labels that the provider assigns to the data through an extensive rule-based system, indicating that our approaches show promise for use in quality control assessment.
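The evaluation idea in the abstract (estimate each observation from its neighborhood, then report root-mean-squared error grouped by quality control indicator) can be illustrated with a minimal sketch. This is not the authors' method: the median is used here only as one well-known robust estimator, the plain mean stands in for a non-robust baseline, and the function names, the toy readings and the flag labels "passed" and "failed" are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def robust_estimate(neighbor_values):
    """Median of the available neighbor readings; None marks missing data."""
    present = [v for v in neighbor_values if v is not None]
    return median(present) if present else None


def mean_estimate(neighbor_values):
    """Plain mean of the available readings (non-robust baseline, an assumption)."""
    present = [v for v in neighbor_values if v is not None]
    return sum(present) / len(present) if present else None


def rmse_by_flag(records, estimator):
    """Group squared errors by QC flag and return the RMSE per flag.

    records: iterable of (observed, neighbor_values, qc_flag) tuples.
    """
    squared_errors = defaultdict(list)
    for observed, neighbor_values, qc_flag in records:
        estimate = estimator(neighbor_values)
        if estimate is None:  # no neighbor reported a value
            continue
        squared_errors[qc_flag].append((observed - estimate) ** 2)
    return {
        flag: math.sqrt(sum(errs) / len(errs))
        for flag, errs in squared_errors.items()
    }


# Hypothetical toy data: (observed value, neighbor readings, QC flag).
records = [
    (10.0, [9.5, 10.5, None, 10.0], "passed"),
    (12.0, [11.0, 12.0, 13.0, 60.0], "passed"),  # one neighbor is an outlier
    (35.0, [10.0, 11.0, 9.0, None], "failed"),
]

robust_rmse = rmse_by_flag(records, robust_estimate)
mean_rmse = rmse_by_flag(records, mean_estimate)
print("median-based:", robust_rmse)
print("mean-based:  ", mean_rmse)
```

In this toy data the single outlying neighbor inflates the mean-based error for the "passed" group, while the median-based estimate stays close to the observation. Both estimators show a large error for the observation flagged "failed", which is the kind of agreement with quality control labels the abstract describes.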