Data should be made as simple as possible but not simpler: The method chosen for dimensionality reduction and its parameters can affect the clustering of runners based on their kinematics.

Impact factor 2.4 · CAS Zone 3 (Medicine) · JCR Q3 (Biophysics)
Adrian R Rivadulla, Xi Chen, Dario Cazzola, Grant Trewartha, Ezio Preatoni
Journal of Biomechanics, volume 177, article 112433. Publication date: 2024-11-15. Publication type: Journal Article.
DOI: 10.1016/j.jbiomech.2024.112433 (https://doi.org/10.1016/j.jbiomech.2024.112433)
Citations: 0

Abstract

Dimensionality reduction is a critical step for the efficacy and efficiency of clustering analysis. Despite the many methods available, biomechanists have often defaulted to Principal Component Analysis (PCA). We evaluated two PCA-based and one autoencoder-based dimensionality reduction methods for their data compression and reconstruction capability, assessed their effect on the output of clustering runners based on their kinematics, and discussed their implications for the biomechanical assessment of running technique. Eighty-four participants completed a 4-minute run at 12 km/h while trunk and lower-limb kinematics were collected. Data reconstruction quality was assessed for Direct PCA (PCA applied directly to the original variables) and Fourier PCA (modelling the time series as Fourier series and then applying PCA) using popular variance-explained criteria, and for a feedforward autoencoder (AE). Agglomerative hierarchical clustering was then applied and the agreement between the resulting partitions was assessed. Meaningful errors in the reconstructed signals were found when applying popular variance-explained criteria, suggesting that reconstruction error should be assessed to make a more informed decision about how many components to retain for further analysis. Direct PCA, Fourier PCA and AE yielded different clusters, warranting caution when comparing outcomes from studies that use different dimensionality reduction techniques: each method may be sensitive to different data features. Direct PCA retaining 99% of the original variance emerged as the best compromise between data compression, reconstruction quality and cluster separability in our dataset. We encourage biomechanists to experiment with diverse dimensionality reduction methods to optimise clustering outcomes and enhance the real-world applicability of their findings.
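
The recommendation to check reconstruction error alongside a variance-explained threshold can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the 101 time points per waveform, the use of RMSE as the error measure and the use of the adjusted Rand index as the agreement measure are all assumptions made for illustration, and the function names are hypothetical. Only the 84-participant count is taken from the abstract.

```python
from collections import Counter
from math import comb

import numpy as np


def pca_reconstruct(X, variance_to_keep):
    """Keep the fewest principal components whose cumulative explained
    variance reaches `variance_to_keep`, then reconstruct X from them."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centred data: the rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(cumulative, variance_to_keep)) + 1, len(s))
    scores = Xc @ Vt[:k].T            # low-dimensional representation
    X_hat = scores @ Vt[:k] + mean    # back-projection to the original space
    return k, X_hat


def rmse(X, X_hat):
    """Root-mean-square reconstruction error, in the units of X."""
    return float(np.sqrt(np.mean((X - X_hat) ** 2)))


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions (assumed metric)."""
    n = len(labels_a)
    if n < 2:
        return 1.0
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = same_a * same_b / comb(n, 2)
    max_index = (same_a + same_b) / 2
    if max_index == expected:
        return 1.0
    return (same_both - expected) / (max_index - expected)


# Synthetic stand-in for joint-angle waveforms: 84 runners x 101 time points
# (the 101-point time normalisation is an assumption, not from the abstract).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
harmonics = np.stack([np.sin(2 * np.pi * h * t) for h in (1, 2, 3, 4)])
amplitudes = rng.normal(size=(84, 4)) * np.array([10.0, 4.0, 2.0, 1.0])
X = amplitudes @ harmonics + rng.normal(scale=0.5, size=(84, 101))

for threshold in (0.90, 0.95, 0.99):
    k, X_hat = pca_reconstruct(X, threshold)
    print(f"{threshold:.0%} variance: {k} components, RMSE = {rmse(X, X_hat):.3f}")

# Identical groupings with swapped labels agree perfectly.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # → 1.0
```

Reporting the RMSE in the original units next to each threshold makes it possible to judge whether the discarded variance is biomechanically negligible. Comparing the cluster labels obtained from two reduced representations with a chance-corrected index is one way to quantify how much the partitions differ.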

Source journal
Journal of Biomechanics (category: Biology; Engineering, Biomedical)
CiteScore: 5.10
Self-citation rate: 4.20%
Articles published: 345
Review time: 1 month
Journal description: The Journal of Biomechanics publishes reports of original and substantial findings using the principles of mechanics to explore biological problems. Analytical as well as experimental papers may be submitted, and the journal accepts original articles, surveys and perspective articles (usually by Editorial invitation only), book reviews and letters to the Editor. The criteria for acceptance of manuscripts include excellence, novelty, significance, clarity, conciseness and interest to the readership. Papers published in the journal may cover a wide range of topics in biomechanics, including, but not limited to:
- Fundamental Topics: Biomechanics of the musculoskeletal, cardiovascular, and respiratory systems, mechanics of hard and soft tissues, biofluid mechanics, mechanics of prostheses and implant-tissue interfaces, mechanics of cells.
- Cardiovascular and Respiratory Biomechanics: Mechanics of blood flow, air flow, mechanics of the soft tissues, flow-tissue or flow-prosthesis interactions.
- Cell Biomechanics: Biomechanical analyses of cells, membranes and sub-cellular structures; the relationship of the mechanical environment to cell and tissue response.
- Dental Biomechanics: Design and analysis of dental tissues and prostheses, mechanics of chewing.
- Functional Tissue Engineering: The role of biomechanical factors in engineered tissue replacements and regenerative medicine.
- Injury Biomechanics: Mechanics of impact and trauma, dynamics of man-machine interaction.
- Molecular Biomechanics: Mechanical analyses of biomolecules.
- Orthopedic Biomechanics: Mechanics of fracture and fracture fixation, mechanics of implants and implant fixation, mechanics of bones and joints, wear of natural and artificial joints.
- Rehabilitation Biomechanics: Analyses of gait, mechanics of prosthetics and orthotics.
- Sports Biomechanics: Mechanical analyses of sports performance.