Daoliang Xu, Tianyou Zheng, Yang Zhang, Xiaodong Yang, Weiwei Fu
Neurocomputing, Volume 648, Article 130632. Published 2025-06-16.
DOI: 10.1016/j.neucom.2025.130632
URL: https://www.sciencedirect.com/science/article/pii/S0925231225013049
Impact factor: 5.5. JCR: Q1 (Computer Science, Artificial Intelligence). Citation count: 0.
MTR-MSE: Motion-Text Retrieval Method Based on Motion Semantics Expansion
The motion-text cross-retrieval task aims to bridge the motion and text spaces, enabling mutual retrieval between motion and language. However, existing methods suffer from limited feature extraction due to both insufficient data and inadequate feature extraction techniques, which restrict retrieval accuracy and semantic richness. To address this, we propose a Motion-Text Retrieval Method Based on Motion Semantics Expansion (MTR-MSE). We design specialized motion and text encoders to create a comprehensive shared feature space. Furthermore, recognizing the limitations of overly simplistic textual descriptions in existing datasets, we enhance motion semantics using large language models to generate more detailed and varied descriptions, thereby improving motion understanding. Experimental results demonstrate that our method achieves state-of-the-art performance, validating its effectiveness in addressing the challenges of cross-modal motion-text retrieval.
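The mutual retrieval over a shared feature space that the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors, the clip names, the captions, and the helpers `cosine` and `rank` are all hypothetical, and cosine similarity is an assumed choice of scoring function. In a real system the vectors would come from trained motion and text encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank(query, gallery):
    """Return gallery keys sorted by descending similarity to the query."""
    return sorted(gallery, key=lambda k: cosine(query, gallery[k]), reverse=True)

# Hypothetical embeddings standing in for encoder outputs in the shared space.
motion_emb = {
    "walk_clip": [0.9, 0.1, 0.0],
    "jump_clip": [0.1, 0.9, 0.2],
    "wave_clip": [0.0, 0.2, 0.9],
}
text_emb = {
    "a person walks forward": [0.8, 0.2, 0.1],
    "a person jumps": [0.2, 0.8, 0.1],
    "a person waves": [0.1, 0.1, 0.9],
}

# Text-to-motion: rank motion clips for a caption.
text_to_motion = rank(text_emb["a person jumps"], motion_emb)
# Motion-to-text: rank captions for a motion clip.
motion_to_text = rank(motion_emb["wave_clip"], text_emb)

print(text_to_motion[0])
print(motion_to_text[0])
```

Because both modalities live in one space, the same ranking function serves both retrieval directions; only the roles of query and gallery are swapped.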
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.