Convolution-enhanced vision transformer method for lower limb exoskeleton locomotion mode recognition
Jianbin Zheng, Chaojie Wang, Liping Huang, Yifan Gao, Ruoxi Yan, Chunbo Yang, Yang Gao, Yu Wang
Expert Systems, vol. 41, no. 10. Published 2024-06-18. DOI: 10.1111/exsy.13659
Providing smooth and natural assistance to the human body through lower limb exoskeletons is crucial. A significant challenge, however, is recognizing the wearer's locomotion mode so that the exoskeleton can offer seamless support. In this study, we propose a locomotion mode recognition method named Convolution-enhanced Vision Transformer (Conv-ViT). The method combines the strengths of convolution for feature extraction and fusion with the Transformer's self-attention mechanism, which efficiently captures long-range dependencies among different positions within the input sequence. By equipping the exoskeleton with inertial measurement units, we collected motion data from 27 healthy subjects and used it as input to train the Conv-ViT model. To ensure the exoskeleton's stability and safety during transitions between locomotion modes, we examined not only the five typical steady modes (level-ground walking [WL], stair ascent [SA], stair descent [SD], ramp ascent [RA], and ramp descent [RD]) but also eight locomotion transitions (WL-SA, WL-SD, WL-RA, WL-RD, SA-WL, SD-WL, RA-WL, RD-WL). On the tasks of recognizing the five steady modes and the eight transitions, recognition accuracy reached 98.87% and 96.74%, respectively. Compared with three popular algorithms (ViT, convolutional neural networks, and support vector machines), the proposed method achieved the best recognition performance, with highly significant differences in accuracy and F1 score relative to the other methods. Finally, we also demonstrated that Conv-ViT generalizes well.
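To make the architecture concrete, below is a minimal sketch of a Conv-ViT-style classifier for IMU windows: a convolutional stem extracts and fuses local features, and a Transformer encoder's self-attention then models dependencies among positions. This is an illustration only; every hyperparameter (window length of 200 samples, 6 IMU channels, embedding size, depth, and the class names) is an assumption, since the abstract does not specify the paper's actual configuration.

```python
import torch
import torch.nn as nn

class ConvViTSketch(nn.Module):
    """Illustrative Conv-ViT-style classifier for IMU windows.

    All hyperparameters here are assumptions for illustration; the
    paper's actual architecture is not given in the abstract.
    """

    def __init__(self, in_channels=6, seq_len=200, d_model=64,
                 n_heads=4, n_layers=2, n_classes=13):
        super().__init__()
        # Convolutional stem: local feature extraction and fusion
        # across IMU channels before the Transformer encoder.
        self.stem = nn.Sequential(
            nn.Conv1d(in_channels, d_model, kernel_size=7, stride=2, padding=3),
            nn.BatchNorm1d(d_model),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm1d(d_model),
            nn.ReLU(),
        )
        n_tokens = seq_len // 4  # two stride-2 convs shrink the sequence 4x
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_tokens + 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):
        # x: (batch, in_channels, seq_len) raw IMU window
        tokens = self.stem(x).transpose(1, 2)   # (batch, n_tokens, d_model)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)          # self-attention over positions
        return self.head(encoded[:, 0])         # classify from the [CLS] token

model = ConvViTSketch()
logits = model(torch.randn(8, 6, 200))  # e.g. 8 windows of 6-axis IMU data
print(logits.shape)  # torch.Size([8, 13]) -> 5 steady modes + 8 transitions
```

The 13 output classes mirror the paper's label set (five steady modes plus eight transitions); the stem-then-encoder split reflects the stated division of labor between convolution and self-attention.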
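The reported comparison against ViT, CNN, and SVM rests on accuracy, F1 score, and a significance test. The sketch below shows one plausible way to compute such metrics and test for a difference across folds; the per-fold scores and the choice of a Wilcoxon signed-rank test are hypothetical, as the abstract does not state which test or raw data the authors used.

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)

# Hypothetical per-fold accuracies for Conv-ViT vs. a baseline (e.g. plain ViT);
# means loosely echo the reported numbers, spreads are invented.
convvit_acc = rng.normal(0.9887, 0.004, size=10)
vit_acc = rng.normal(0.9700, 0.006, size=10)

# Paired non-parametric test across folds, one common way to support a
# "highly significant difference" claim.
stat, p = wilcoxon(convvit_acc, vit_acc)
print(f"Wilcoxon W={stat:.1f}, p={p:.4f}")

# Per-window metrics from predicted vs. true locomotion-mode labels.
y_true = rng.integers(0, 13, size=500)  # 13 classes: 5 steady + 8 transitions
y_pred = np.where(rng.random(500) < 0.95, y_true, (y_true + 1) % 13)
print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```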
Journal introduction:
Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems – including expert systems – based thereon. Detailed scientific evaluation is an essential part of any paper.
In addition to traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, we also aim at the new and growing markets for these technologies, such as Business, Economy, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emergent topics.