Motif clustering and digital biomarker extraction for free-living physical activity analysis.

IF 6.1, Zone 3 (Biology), JCR Q1, MATHEMATICAL & COMPUTATIONAL BIOLOGY

Biodata Mining Pub Date : 2025-01-22 DOI:10.1186/s13040-025-00424-1

Ya-Ting Liang, Charlotte Wang

Biodata Mining 18(1), 8 (2025). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11753168/pdf/

Citations: 0

Abstract

Background: Analyzing free-living physical activity (PA) data presents challenges due to variability in daily routines and the lack of activity labels. Traditional approaches often rely on summary statistics, which may not capture the nuances of individual activity patterns. To address these limitations and advance our understanding of the relationship between PA patterns and health outcomes, we propose a novel motif clustering algorithm that identifies and characterizes specific PA patterns.

Methods: This paper proposes an elastic distance-based motif clustering algorithm for identifying specific PA patterns (motifs) in free-living PA data. The algorithm divides long-term PA curves into short-term segments and uses elastic shape analysis to measure the similarity between activity segments. This enables the discovery of recurring motifs through pattern clustering. Functional principal component analysis (FPCA) is then used to extract digital biomarkers from each motif. These digital biomarkers can subsequently be used to explore the relationship between PA and health outcomes of interest.
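The segment-then-cluster idea in the Methods can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: dynamic time warping (DTW) is swapped in as a simple stand-in for the elastic shape distance, k-medoids is an assumed choice of clustering method, and the window length, the function names (`segment`, `dtw_distance`, `k_medoids`) and the toy data are all hypothetical.

```python
import math

def segment(curve, length):
    """Split a long activity curve into consecutive non-overlapping windows."""
    return [curve[i:i + length] for i in range(0, len(curve) - length + 1, length)]

def dtw_distance(a, b):
    """DTW distance between two segments (stand-in for an elastic distance)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of the three admissible warping moves
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])

def k_medoids(dist, k, iters=20):
    """Cluster items from a precomputed distance matrix (assumed method)."""
    n = len(dist)

    def assign(medoids):
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    # farthest-first initialization keeps the sketch deterministic
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(iters):
        labels = assign(medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                updated.append(medoids[c])
                continue
            updated.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return assign(medoids), medoids

# Toy "long-term" curve: a short burst, a sustained plateau, a time-shifted
# burst, and a second plateau (all values are made up for illustration).
burst = [0, 0, 5, 0, 0, 0, 0, 0]
shifted_burst = [0, 0, 0, 0, 5, 0, 0, 0]
plateau = [3, 3, 3, 3, 3, 3, 3, 3]
plateau_2 = [3, 3, 3, 3, 3, 3, 3, 2]
curve = burst + plateau + shifted_burst + plateau_2

segments = segment(curve, 8)
dist = [[dtw_distance(a, b) for b in segments] for a in segments]
labels, medoids = k_medoids(dist, k=2)
print(labels)
```

Because the distance tolerates time shifts, the two bursts land in the same cluster even though their peaks occur at different positions, which a pointwise Euclidean distance would penalize.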

Results: We demonstrate the efficacy of our method through three real-world applications. Results show that digital biomarkers derived from these motifs effectively capture the association between PA patterns and disease outcomes, improving the accuracy of patient classification.

Conclusions: This study introduced a novel approach to analyzing free-living PA data by identifying and characterizing specific activity patterns (motifs). The derived digital biomarkers provide a more nuanced understanding of PA and its impact on health, with potential applications in personalized health assessment and disease detection, offering a promising future for healthcare.
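The FPCA step named in the Methods can be sketched in the same spirit. The example below assumes that the digital biomarkers are the principal component scores of the segments within one motif, that curves are sampled on a common time grid, and that only the leading component is needed; it uses power iteration rather than a full eigendecomposition, and the function name `fpca_first_component` and the toy curves are hypothetical.

```python
import math

def fpca_first_component(curves, iters=200):
    """Leading principal component of discretized curves via power iteration.

    Returns (mean_curve, component, scores); the scores play the role of
    one candidate digital biomarker per curve.
    """
    n, t = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(t)]
    centered = [[c[j] - mean[j] for j in range(t)] for c in curves]

    # fixed, non-constant starting vector (normalized)
    phi = [float(j + 1) for j in range(t)]
    norm = math.sqrt(sum(v * v for v in phi))
    phi = [v / norm for v in phi]

    for _ in range(iters):
        # multiply by X^T X without forming the covariance matrix
        proj = [sum(x[j] * phi[j] for j in range(t)) for x in centered]
        new = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(t)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:  # no variation left to explain
            break
        phi = [v / norm for v in new]

    scores = [sum(x[j] * phi[j] for j in range(t)) for x in centered]
    return mean, phi, scores

# Toy motif: every curve is a common mean shape plus a scaled mode of variation.
mu = [1.0, 2.0, 3.0, 2.0, 1.0]
scale = math.sqrt(6.0)
phi_true = [0.0, 1.0 / scale, 2.0 / scale, 1.0 / scale, 0.0]
amplitudes = [-2.0, -1.0, 0.0, 1.0, 2.0]
curves = [[mu[j] + a * phi_true[j] for j in range(5)] for a in amplitudes]

mean_curve, component, scores = fpca_first_component(curves)
print([round(s, 6) for s in scores])
```

In this toy setting the recovered scores match the amplitudes used to generate the curves, so each score summarizes how strongly one curve expresses the motif's main mode of variation. Scores of this kind could then serve as covariates in a downstream classification or regression model.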


Source journal: Biodata Mining (MATHEMATICAL & COMPUTATIONAL BIOLOGY)

CiteScore: 7.90

Self-citation rate: 0.00%

Articles published: not listed

Review time: 23 weeks

About the journal: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to:

- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.