Dechao Chen;Zhengwen Chen;Xiangyan Zheng;Weiling Xu;Chencong Ma;Chentao Mao
{"title":"ADP:自适应扩散策略在学习和实践中激发机器人思维","authors":"Dechao Chen;Zhengwen Chen;Xiangyan Zheng;Weiling Xu;Chencong Ma;Chentao Mao","doi":"10.1109/TASE.2025.3612396","DOIUrl":null,"url":null,"abstract":"Adaptive control policies for robots often require balancing generalization from large offline datasets with efficient adaptation to specific deployment conditions. In this paper, we propose Adaptive Diffusion Policy (ADP), a two-stage framework that integrates temporal-aware diffusion models with parameter-efficient LoRA adaptation. First, in the learning stage, ADP imitates and generates actions based on image and video signals from a meager amount of expert demonstrations, considering both spatial and temporal information. This component contrasts with most existing works, which focus solely on spatial information. Second, in the practice stage, ADP incorporates a low-rank adaptation module into the policy, subsequently training it using residual reinforcement learning with minimal environment interaction. Experiments conducted on Meta-World benchmark demonstrate the efficiency of each ADP component and the superiority of ADP over representative baseline methods. Note to Practitioners—This work introduces Adaptive Diffusion Policy (ADP), a two-stage visuomotor framework that first learns from just a few image-and-video demonstrations by modeling both spatial and temporal cues, then rapidly refines its behavior via a lightweight low-rank adapter and residual reinforcement learning. The ADP enables swift skill acquisition on new tasks with minimal expert data and limited environment trials, making it ideal for industrial or household robots where extensive data collection is impractical. To apply ADP, collect a small set of demonstration clips, train the diffusion-based policy offline, and deploy the adapter online for in situ fine-tuning. The proposed Meta-World results show the ADP’s consistent gains over standard imitation and residual RL baselines, which is very easy for practitioners in multiple real-world robot scenarios.","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"22 ","pages":"21585-21594"},"PeriodicalIF":6.4000,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ADP: Adaptive Diffusion Policy Energizes Robots Thinking in Both Learning and Practice\",\"authors\":\"Dechao Chen;Zhengwen Chen;Xiangyan Zheng;Weiling Xu;Chencong Ma;Chentao Mao\",\"doi\":\"10.1109/TASE.2025.3612396\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Adaptive control policies for robots often require balancing generalization from large offline datasets with efficient adaptation to specific deployment conditions. In this paper, we propose Adaptive Diffusion Policy (ADP), a two-stage framework that integrates temporal-aware diffusion models with parameter-efficient LoRA adaptation. First, in the learning stage, ADP imitates and generates actions based on image and video signals from a meager amount of expert demonstrations, considering both spatial and temporal information. This component contrasts with most existing works, which focus solely on spatial information. Second, in the practice stage, ADP incorporates a low-rank adaptation module into the policy, subsequently training it using residual reinforcement learning with minimal environment interaction. 
Experiments conducted on Meta-World benchmark demonstrate the efficiency of each ADP component and the superiority of ADP over representative baseline methods. Note to Practitioners—This work introduces Adaptive Diffusion Policy (ADP), a two-stage visuomotor framework that first learns from just a few image-and-video demonstrations by modeling both spatial and temporal cues, then rapidly refines its behavior via a lightweight low-rank adapter and residual reinforcement learning. The ADP enables swift skill acquisition on new tasks with minimal expert data and limited environment trials, making it ideal for industrial or household robots where extensive data collection is impractical. To apply ADP, collect a small set of demonstration clips, train the diffusion-based policy offline, and deploy the adapter online for in situ fine-tuning. The proposed Meta-World results show the ADP’s consistent gains over standard imitation and residual RL baselines, which is very easy for practitioners in multiple real-world robot scenarios.\",\"PeriodicalId\":51060,\"journal\":{\"name\":\"IEEE Transactions on Automation Science and Engineering\",\"volume\":\"22 \",\"pages\":\"21585-21594\"},\"PeriodicalIF\":6.4000,\"publicationDate\":\"2025-09-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Automation Science and Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11174996/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Automation Science and Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11174996/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
ADP: Adaptive Diffusion Policy Energizes Robots Thinking in Both Learning and Practice
Adaptive control policies for robots often require balancing generalization from large offline datasets with efficient adaptation to specific deployment conditions. In this paper, we propose Adaptive Diffusion Policy (ADP), a two-stage framework that integrates temporal-aware diffusion models with parameter-efficient LoRA adaptation. First, in the learning stage, ADP imitates and generates actions from image and video signals drawn from a small number of expert demonstrations, considering both spatial and temporal information. This contrasts with most existing works, which focus solely on spatial information. Second, in the practice stage, ADP incorporates a low-rank adaptation module into the policy and trains it with residual reinforcement learning using minimal environment interaction. Experiments conducted on the Meta-World benchmark demonstrate the effectiveness of each ADP component and the superiority of ADP over representative baseline methods.

Note to Practitioners—This work introduces Adaptive Diffusion Policy (ADP), a two-stage visuomotor framework that first learns from just a few image-and-video demonstrations by modeling both spatial and temporal cues, then rapidly refines its behavior via a lightweight low-rank adapter and residual reinforcement learning. ADP enables rapid skill acquisition on new tasks with minimal expert data and limited environment trials, making it well suited for industrial or household robots where extensive data collection is impractical. To apply ADP, collect a small set of demonstration clips, train the diffusion-based policy offline, and deploy the adapter online for in situ fine-tuning. Results on the Meta-World benchmark show consistent gains for ADP over standard imitation and residual RL baselines, suggesting it can be adopted with little effort across a range of real-world robot scenarios.
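The two adaptation ingredients named in the abstract, a parameter-efficient low-rank (LoRA) adapter and residual reinforcement learning layered on a frozen imitation policy, can be sketched in a few lines. The snippet below is a minimal illustration of those general techniques in PyTorch; the names (`LoRALinear`, `adapted_action`, `residual_head`) are hypothetical and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update:
    y = W x + (alpha / r) * B(A x)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weights
        self.lora_A = nn.Linear(base.in_features, rank, bias=False)
        self.lora_B = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)   # adapter starts as a no-op w.r.t. the base policy
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_B(self.lora_A(x))


def adapted_action(base_policy: nn.Module, obs: torch.Tensor,
                   residual_head: nn.Module | None = None) -> torch.Tensor:
    """Compose the frozen imitation policy's action with a small learned residual."""
    with torch.no_grad():
        a_base = base_policy(obs)            # action proposed by the offline-trained policy
    a_res = residual_head(obs) if residual_head is not None else 0.0
    return a_base + a_res                    # residual RL refines, rather than replaces, the base action
```

Under this reading of the practice stage, one would freeze the offline-trained diffusion policy, wrap selected linear layers with such adapters, and train only the adapter and residual parameters against the RL objective, keeping both the online interaction budget and the number of trainable parameters small.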
Journal Introduction:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.