Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders

bioRxiv. Publication date: 2024-08-08. DOI: 10.1101/2024.08.06.606796

Lucas Stoffl, Andy Bonnetto, Stéphane d’Ascoli, Alexander Mathis



Abstract

Natural behavior is hierarchical, yet few benchmarks address this aspect. To address the scarcity of large-scale hierarchical behavioral benchmarks, we create a novel synthetic basketball-playing benchmark (Shot7M2). Beyond synthetic data, we extend BABEL into a hierarchical action segmentation benchmark (hBABEL). We then develop a masked autoencoder framework (hBehaveMAE) to elucidate the hierarchical nature of motion capture data in an unsupervised fashion. We find that hBehaveMAE learns interpretable latents on Shot7M2 and hBABEL: lower encoder levels are better at representing fine-grained movements, while higher encoder levels capture complex actions and activities. Additionally, we evaluate hBehaveMAE on MABe22, a representation learning benchmark with short- and long-term behavioral states, where it achieves state-of-the-art performance without domain-specific feature extraction. Together, these components contribute towards unveiling the hierarchical organization of natural behavior. Models and benchmarks are available at https://github.com/amathislab/BehaveMAE.
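As a rough illustration of the general masked-autoencoding idea mentioned in the abstract, the sketch below hides a random subset of motion-capture frames and scores a reconstruction only on the hidden frames. It is not the hBehaveMAE implementation: the function names `random_mask` and `masked_mse`, the 75% mask ratio, and the toy two-value frames are all assumptions made for this example, and the "prediction" is a placeholder rather than the output of a trained model.

```python
import random


def random_mask(num_tokens, mask_ratio, seed=0):
    """Split token indices into a visible set and a masked set (hypothetical helper)."""
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_tokens))
    rng.shuffle(indices)
    num_masked = int(num_tokens * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


def masked_mse(target, prediction, masked):
    """Mean squared error computed over the masked tokens only (hypothetical helper)."""
    total = 0.0
    count = 0
    for i in masked:
        for t, p in zip(target[i], prediction[i]):
            total += (t - p) ** 2
            count += 1
    return total / count if count else 0.0


# Toy "motion capture" sequence: 8 frames, 2 made-up coordinates per frame.
frames = [[float(t), float(2 * t)] for t in range(8)]

# Hide 75% of the frames; an encoder would only see the visible ones.
visible, masked = random_mask(len(frames), mask_ratio=0.75)

# Placeholder reconstruction (all zeros) standing in for a decoder's output.
prediction = [[0.0, 0.0] for _ in frames]
loss = masked_mse(frames, prediction, masked)
```

In a real model the visible frames would pass through an encoder and a decoder would predict the masked ones; scoring only the masked positions is what pushes the model to infer missing movement from its context.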
