Diversity-Driven Contrastive Value Ensembles with Categorical Constraints for Goal-Conditioned Robotic Control

IF 19.2 Zone 1, Computer Science Q1 AUTOMATION & CONTROL SYSTEMS

IEEE/CAA Journal of Automatica Sinica Pub Date: 2026-04-01 Epub Date: 2026-04-30 DOI: 10.1109/JAS.2025.125885

Zhiyi Shi;Ruihao Zhu;Shuai Wu;Wei Tong;Guangyu Zhu;Edmond Q. Wu

Volume 13, Issue 4, Pages 1001-1003. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/11503199/

Citations: 0

Abstract

Dear Editor, this letter presents a framework based on contrastive reinforcement learning (Contrastive RL) for challenging goal-conditioned problems in robotic control. While Contrastive RL shows promise for learning state-action-goal relationships, it suffers from a critical limitation: insufficient discriminability between positive and negative samples, attributed to inefficient value exploration and model overfitting. To overcome these challenges, the proposed algorithm extends Contrastive RL with an ensemble of critic networks that model state-action-goal alignment, alleviating the overfitting problem. Furthermore, the architecture introduces a dual-component loss function: 1) a diversity-driven term to mitigate exploration redundancy in value estimation; and 2) a categorical-guidance constraint to ensure discriminability across contrasting pairs. We term this integrated framework diversity-driven contrastive value ensembles with categorical constraints (DiCE-CC). Experimental validation across three robotic manipulation scenarios demonstrates the effectiveness of the proposed algorithm in solving complex goal-conditioned control problems.
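The abstract does not state the loss formulas, so the following is only an illustrative sketch of how an ensemble of contrastive critics with two auxiliary terms might be combined. Everything here is an assumption rather than the letter's method: the inner-product critic, the binary NCE objective, the variance-based diversity term, the row-wise cross-entropy standing in for the categorical-guidance constraint, and the names `ensemble_logits`, `contrastive_ensemble_loss`, `diversity_weight`, and `categorical_weight`.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -np.logaddexp(0.0, -x)

def log_softmax(x, axis):
    # Numerically stable log-softmax along `axis`.
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def ensemble_logits(sa_repr, g_repr):
    # sa_repr: (K, B, D) state-action embeddings, one set per critic (assumed).
    # g_repr:  (K, B, D) goal embeddings from the same K critics (assumed).
    # Returns (K, B, B): entry [k, i, j] scores state-action i against goal j.
    return np.einsum("kid,kjd->kij", sa_repr, g_repr)

def contrastive_ensemble_loss(logits, diversity_weight=0.1, categorical_weight=1.0):
    # logits: (K, B, B); diagonal entries are positive pairs, the rest negatives.
    _, batch, _ = logits.shape
    eye = np.eye(batch)
    # Binary NCE term, averaged over critics and pairs.
    nce = -(eye * log_sigmoid(logits) + (1.0 - eye) * log_sigmoid(-logits)).mean()
    # Assumed diversity term: negative variance across critics, so minimizing
    # it rewards disagreement. It is unbounded, hence the small default weight.
    diversity = -logits.var(axis=0).mean()
    # Assumed categorical term: cross-entropy for picking the true goal per row.
    row_log_probs = log_softmax(logits, axis=2)
    categorical = -np.diagonal(row_log_probs, axis1=1, axis2=2).mean()
    total = nce + diversity_weight * diversity + categorical_weight * categorical
    return total, {"nce": nce, "diversity": diversity, "categorical": categorical}
```

With random embeddings of shape `(3, 4, 8)` for three hypothetical critics, `contrastive_ensemble_loss(ensemble_logits(sa, g))` returns a scalar loss and its three components. If all critics are identical, the diversity component is zero, which is the redundancy such a term would be meant to penalize.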




Source journal

IEEE/CAA Journal of Automatica Sinica (Engineering: Control and Systems Engineering)

CiteScore: 23.50

Self-citation rate: 11.00%

Articles published: 880

About the journal: The IEEE/CAA Journal of Automatica Sinica publishes English-language papers on original theoretical and experimental research and development in the field of automation. The journal covers a wide range of topics including automatic control, artificial intelligence and intelligent control, systems theory and engineering, pattern recognition and intelligent systems, automation engineering and applications, information processing and information systems, network-based automation, robotics, sensing and measurement, and navigation, guidance, and control. The journal is abstracted and indexed in several databases, including SCIE (Science Citation Index Expanded), EI (Engineering Index), Inspec, Scopus, SCImago, DBLP, CNKI (China National Knowledge Infrastructure), CSCD (Chinese Science Citation Database), and IEEE Xplore.