Outlier detection in weight time series of connected scales

Saeed Mehrang, E. Helander, M. Pavel, A. Chieh, I. Korhonen
Published in: 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2015-11-09
DOI: 10.1109/BIBM.2015.7359896 (https://doi.org/10.1109/BIBM.2015.7359896)
Citations: 23

Abstract

In principle, connected sensors allow effortless long-term self-monitoring of health and wellness, which can help maintain health and quality of life. However, data collected in the “wild” may be noisy and contain outliers, e.g., due to uncontrolled sources or because different persons use the same device. Removing these outliers is therefore critical for accurate interpretation of the data. In this paper we study the detection and elimination of outliers in self-weighing time series data obtained from connected weight scales. We examined three techniques: (1) a method based on autoregressive integrated moving average (ARIMA) time series modelling, (2) a method based on the median absolute deviation (MAD) scale estimate, and (3) a method based on Rosner statistics. We applied these methods to both a data set with real outliers and a clean data set corrupted with simulated outliers. The results suggest that the simple MAD algorithm and ARIMA performed well on both test sets, whereas the method based on Rosner statistics was significantly less effective. In addition, the ARIMA approach appeared to be significantly less sensitive to long periods of missing data than the MAD and Rosner methods.
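
To make the second technique concrete, the following is a minimal sketch of a generic MAD-based outlier rule applied to a short weight series. The abstract does not give the paper's windowing, threshold, or preprocessing, so everything here is an assumption for illustration: the function name `mad_outliers`, the threshold of 3, the use of a single global median rather than a moving window, and the sample values in `weights_kg` are all hypothetical and are not taken from the paper.

```python
from statistics import median


def mad_outliers(weights, threshold=3.0):
    """Return indices of values whose robust z-score exceeds `threshold`.

    Generic MAD rule, not the paper's exact procedure (assumption).
    """
    med = median(weights)
    abs_dev = [abs(w - med) for w in weights]
    # 1.4826 scales the MAD so that it estimates the standard deviation
    # for normally distributed data.
    mad = 1.4826 * median(abs_dev)
    if mad == 0:
        # At least half the values equal the median; the rule is undefined,
        # so this sketch flags nothing.
        return []
    return [i for i, d in enumerate(abs_dev) if d / mad > threshold]


# Hypothetical readings in kg; the fifth value mimics a different person
# stepping on the same scale.
weights_kg = [80.1, 80.4, 79.9, 80.2, 62.3, 80.0, 80.3, 79.8]
flagged = mad_outliers(weights_kg)
print(flagged)
```

A single global median ignores slow weight trends. For long series, a moving window would likely be more appropriate, though the abstract does not say which variant the authors used.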