A Comprehensive Dataset for Activity of Daily Living (ADL) Research Compiled by Unifying and Processing Multiple Data Sources
Jaime Pabón, Daniel Gómez, Jesús D Cerón, Ricardo Salazar-Cabrera, Diego M López, Bernd Blobel
Journal of Personalized Medicine, volume 15, issue 5. Published 2025-05-21. DOI: 10.3390/jpm15050210 (https://doi.org/10.3390/jpm15050210)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12113171/pdf/
Citations: 0
Abstract
Background: Activities of Daily Living (ADLs) are essential tasks performed at home. In healthcare, they are used to monitor sedentary behavior, track rehabilitation therapy, and monitor chronic obstructive pulmonary disease. The Barthel Index, used by healthcare professionals, is limited by its subjectivity. Human activity recognition (HAR) uses Information and Communication Technologies (ICTs) to assess ADLs more accurately. This work aims to create a single, adaptable, and heterogeneous ADL dataset that integrates information from various sources, ensuring a rich representation of different individuals and environments.
Methods: A literature review was conducted in Scopus, the University of California Irvine (UCI) Machine Learning Repository, Google Dataset Search, and the University of Cauca Repository to obtain datasets related to ADLs. Inclusion criteria were defined, and a list of dataset characteristics was compiled to support the integration of multiple datasets. Twenty-nine datasets were identified, including data from various accelerometers, gyroscopes, inclinometers, and heart rate monitors. The datasets identified in the review were classified and analyzed. The work comprised dataset selection, categorization, analysis, cleaning, normalization, and data integration.
Results: The resulting unified dataset contains 238,990 samples, 56 activities, and 52 columns. It draws on diverse individuals and environments, which improves its adaptability to various applications.
Conclusions: The dataset can be used in data science projects related to ADL and HAR. Because it integrates diverse data sources, it is potentially useful in addressing bias in machine learning models and improving their generalizability.
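The cleaning, normalization, and integration steps named in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the source names (`source_a`, `source_b`), the column maps, the unit scales, the label table, and the choice of z-score normalization are all hypothetical assumptions made only to show how heterogeneous accelerometer records might be brought into one schema.

```python
from statistics import mean, pstdev

G = 9.80665  # standard gravity in m/s^2, used to convert readings given in g

# Hypothetical schema maps: source column name -> unified column name.
SCHEMAS = {
    "source_a": {"acc_x": "ax", "acc_y": "ay", "acc_z": "az", "label": "activity"},
    "source_b": {"x": "ax", "y": "ay", "z": "az", "act": "activity"},
}
# Hypothetical unit scales: source_b is assumed to record acceleration in g.
SCALES = {"source_a": 1.0, "source_b": G}
# Hypothetical table that maps source labels onto a shared vocabulary.
LABELS = {"walk": "walking", "walking": "walking", "sit": "sitting", "sitting": "sitting"}
AXES = ("ax", "ay", "az")


def unify(source, rows):
    """Rename columns, convert units, harmonize labels, and drop unusable rows."""
    schema, scale = SCHEMAS[source], SCALES[source]
    out = []
    for row in rows:
        rec = {new: row.get(old) for old, new in schema.items()}
        if any(value is None for value in rec.values()):
            continue  # cleaning: skip rows with missing values
        label = LABELS.get(str(rec["activity"]).strip().lower())
        if label is None:
            continue  # cleaning: skip activities outside the shared vocabulary
        unified = {axis: float(rec[axis]) * scale for axis in AXES}
        unified["activity"] = label
        unified["source"] = source
        out.append(unified)
    return out


def zscore(rows):
    """Z-score each axis across all merged rows (0.0 when an axis is constant)."""
    if not rows:
        return []
    result = [dict(r) for r in rows]
    for axis in AXES:
        values = [r[axis] for r in rows]
        mu, sigma = mean(values), pstdev(values)
        for r in result:
            r[axis] = (r[axis] - mu) / sigma if sigma else 0.0
    return result


def integrate(datasets):
    """Unify every source, concatenate the rows, then normalize the merged set."""
    merged = []
    for source, rows in datasets.items():
        merged.extend(unify(source, rows))
    return zscore(merged)


# Toy input: one valid and one unusable row per hypothetical source.
sample = {
    "source_a": [
        {"acc_x": 0.0, "acc_y": 9.8, "acc_z": 0.1, "label": "Walk"},
        {"acc_x": 0.2, "acc_y": None, "acc_z": 0.3, "label": "walk"},
    ],
    "source_b": [
        {"x": 0.0, "y": 1.0, "z": 0.0, "act": "sitting"},
        {"x": 0.1, "y": 0.9, "z": 0.0, "act": "flying"},
    ],
}
unified_rows = integrate(sample)
```

Keeping a `source` field on every row is a deliberate choice in this sketch: it lets later analyses check whether a model has learned source-specific artifacts rather than the activity itself, which relates to the bias and generalizability concerns raised in the Conclusions.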
About the journal:
Journal of Personalized Medicine (JPM; ISSN 2075-4426) is an international, open access journal aimed at bringing all aspects of personalized medicine to one platform. JPM publishes cutting-edge, innovative preclinical and translational scientific research and technologies related to personalized medicine (e.g., pharmacogenomics/proteomics, systems biology). JPM recognizes that personalized medicine (the assessment of genetic, environmental and host factors that cause variability of individuals) is a challenging, transdisciplinary topic that requires discussions from a range of experts. For a comprehensive perspective on personalized medicine, JPM aims to integrate expertise from the molecular and translational sciences, therapeutics and diagnostics, as well as discussions of regulatory, social, ethical and policy aspects. We provide a forum to bring together academic and clinical researchers, biotechnology, diagnostic and pharmaceutical companies, health professionals, regulatory and ethical experts, and government and regulatory authorities.