可穿戴活动识别的多样化,现实和注释数据集的集合

I. Cleland, M. Donnelly, C. Nugent, J. Hallberg, M. Espinilla, M. Garcia-Constantino
{"title":"可穿戴活动识别的多样化,现实和注释数据集的集合","authors":"I. Cleland, M. Donnelly, C. Nugent, J. Hallberg, M. Espinilla, M. Garcia-Constantino","doi":"10.1109/PERCOMW.2018.8480322","DOIUrl":null,"url":null,"abstract":"This paper discusses the opportunities and challenges associated with the collection of a large scale, diverse dataset for Activity Recognition. The dataset was collected by 141 undergraduate students, in a controlled environment. Students collected triaxial accelerometer data from a wearable accelerometer whilst each carrying out 3 of the 18 investigated activities, categorized into 6 scenarios of daily living. This data was subsequently labelled, anonymized and uploaded to a shared repository. This paper presents an analysis of data quality, through outlier detection and assesses the suitability of the dataset for the creation and validation of Activity Recognition models. This is achieved through the application of a range of common data driven machine learning approaches. Finally, the paper describes challenges identified during the data collection process and discusses how these could be addressed. Issues surrounding data quality, in particular, identifying and addressing poor calibration of the data were identified. Results highlight the potential of harnessing these diverse data for Activity Recognition. Based on a comparison of six classification approaches, a Random Forest provided the best classification (F-measure: 0.88). In future data collection cycles, participants will be encouraged to collect a set of “common” activities, to support generation of a larger homogeneous dataset. Future work will seek to refine the methodology further and to evaluate model on new unseen data.","PeriodicalId":190096,"journal":{"name":"2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"Collection of a Diverse, Realistic and Annotated Dataset for Wearable Activity Recognition\",\"authors\":\"I. Cleland, M. Donnelly, C. Nugent, J. Hallberg, M. Espinilla, M. Garcia-Constantino\",\"doi\":\"10.1109/PERCOMW.2018.8480322\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper discusses the opportunities and challenges associated with the collection of a large scale, diverse dataset for Activity Recognition. The dataset was collected by 141 undergraduate students, in a controlled environment. Students collected triaxial accelerometer data from a wearable accelerometer whilst each carrying out 3 of the 18 investigated activities, categorized into 6 scenarios of daily living. This data was subsequently labelled, anonymized and uploaded to a shared repository. This paper presents an analysis of data quality, through outlier detection and assesses the suitability of the dataset for the creation and validation of Activity Recognition models. This is achieved through the application of a range of common data driven machine learning approaches. Finally, the paper describes challenges identified during the data collection process and discusses how these could be addressed. Issues surrounding data quality, in particular, identifying and addressing poor calibration of the data were identified. Results highlight the potential of harnessing these diverse data for Activity Recognition. Based on a comparison of six classification approaches, a Random Forest provided the best classification (F-measure: 0.88). In future data collection cycles, participants will be encouraged to collect a set of “common” activities, to support generation of a larger homogeneous dataset. Future work will seek to refine the methodology further and to evaluate model on new unseen data.\",\"PeriodicalId\":190096,\"journal\":{\"name\":\"2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)\",\"volume\":\"30 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-03-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PERCOMW.2018.8480322\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PERCOMW.2018.8480322","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 17

摘要

本文讨论了与收集大规模、多样化的活动识别数据集相关的机遇和挑战。该数据集是由141名本科生在受控环境中收集的。学生们从可穿戴式加速度计中收集三轴加速度计数据,同时每人进行18项调查活动中的3项,这些活动被分为6种日常生活场景。这些数据随后被标记、匿名并上传到共享存储库。本文通过异常值检测对数据质量进行了分析,并评估了数据集对活动识别模型的创建和验证的适用性。这是通过应用一系列常见的数据驱动机器学习方法来实现的。最后,本文描述了在数据收集过程中发现的挑战,并讨论了如何解决这些挑战。确定了围绕数据质量的问题,特别是识别和解决数据校准不良的问题。结果强调了利用这些不同数据进行活动识别的潜力。基于6种分类方法的比较,随机森林提供了最好的分类(F-measure: 0.88)。在未来的数据收集周期中,将鼓励参与者收集一组“公共”活动,以支持生成更大的同构数据集。未来的工作将寻求进一步完善的方法,并评估新的看不见的数据模型。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Collection of a Diverse, Realistic and Annotated Dataset for Wearable Activity Recognition
This paper discusses the opportunities and challenges associated with the collection of a large scale, diverse dataset for Activity Recognition. The dataset was collected by 141 undergraduate students, in a controlled environment. Students collected triaxial accelerometer data from a wearable accelerometer whilst each carrying out 3 of the 18 investigated activities, categorized into 6 scenarios of daily living. This data was subsequently labelled, anonymized and uploaded to a shared repository. This paper presents an analysis of data quality, through outlier detection and assesses the suitability of the dataset for the creation and validation of Activity Recognition models. This is achieved through the application of a range of common data driven machine learning approaches. Finally, the paper describes challenges identified during the data collection process and discusses how these could be addressed. Issues surrounding data quality, in particular, identifying and addressing poor calibration of the data were identified. Results highlight the potential of harnessing these diverse data for Activity Recognition. Based on a comparison of six classification approaches, a Random Forest provided the best classification (F-measure: 0.88). In future data collection cycles, participants will be encouraged to collect a set of “common” activities, to support generation of a larger homogeneous dataset. Future work will seek to refine the methodology further and to evaluate model on new unseen data.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信