Contrastive-Learning-Based Decision Making for Dynamic Time-Linkage Optimization

IF 8.7 | Zone 1 (Computer Science) | JCR Q1 (AUTOMATION & CONTROL SYSTEMS)
Xiao-Fang Liu;Meng Gao;Yongchun Fang;Zhi-Hui Zhan;Jun Zhang
DOI: 10.1109/TSMC.2025.3611797
Journal: IEEE Transactions on Systems Man Cybernetics-Systems, 55 (11), pp. 8661-8674
Published: 2025-09-24
Publication type: Journal Article
Source: https://ieeexplore.ieee.org/document/11176976/
Citations: 0

Abstract

In dynamic time-linkage optimization, current decisions influence the future state of the environment. To make decisions that have a positive impact on future states, existing methods usually build a model that predicts the future rewards of solutions. However, these prediction models have low accuracy because the available decision data are too scarce to train such a complex model. To address this issue, this article proposes a contrastive-learning-based decision making (CLDM) method, which builds a contrastive model to learn the relative relationship between solutions rather than their absolute rewards, and adopts a quick decision strategy to select solutions. In CLDM, a clustering-based time-linkage detection (CD) strategy measures the intensity of the time linkage, which determines whether decisions should take future rewards into account. To represent the relative relationship between solutions, a large number of contrastive samples are constructed from the limited historical decisions. A contrastive model is then trained to compare solutions in terms of the combination of current fitness and future rewards. Candidate solutions are clustered into multiple groups to filter out poor ones, and the few solutions that remain are ranked using the contrastive model. The winner is taken as the decision solution. Integrating CLDM into particle swarm optimization (PSO) yields a new algorithm named contrastive-learning-based PSO (CL-PSO). Experimental results on multiple dynamic time-linkage optimization instances demonstrate that CL-PSO outperforms state-of-the-art algorithms in terms of solution quality. CL-PSO also solves the mobile robot path planning problem well.
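
The abstract does not say how contrastive samples are encoded or how the final winner is chosen from the pairwise comparisons, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each historical decision has a single combined score (higher is better), that a sample is the concatenation of two solution vectors with a binary label, and that the winner is the candidate with the most pairwise wins. The names `build_contrastive_samples`, `pick_winner`, and `prefers` are hypothetical.

```python
from itertools import permutations


def build_contrastive_samples(solutions, scores):
    """Turn N scored decisions into N*(N-1) ordered pairwise samples.

    solutions: list of N decision vectors (lists of numbers).
    scores:    list of N combined scores (assumption: higher is better).
    Returns a list of (features, label) tuples, where features is the
    first vector followed by the second, and label is 1 if the first
    solution scores at least as well as the second, else 0.
    """
    samples = []
    for i, j in permutations(range(len(solutions)), 2):
        features = list(solutions[i]) + list(solutions[j])
        label = 1 if scores[i] >= scores[j] else 0
        samples.append((features, label))
    return samples


def pick_winner(candidates, prefers):
    """Return the index of the candidate with the most pairwise wins.

    prefers(a, b) -> bool is any pairwise comparator; a trained
    contrastive model could be wrapped in such a function (assumption).
    Ties are broken in favor of the lowest index.
    """
    wins = [0] * len(candidates)
    for i, j in permutations(range(len(candidates)), 2):
        if prefers(candidates[i], candidates[j]):
            wins[i] += 1
    return max(range(len(candidates)), key=wins.__getitem__)
```

With this encoding, N historical decisions give N(N-1) training samples, which illustrates how a small decision history can be expanded into a much larger contrastive data set.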
Source journal
IEEE Transactions on Systems Man Cybernetics-Systems (AUTOMATION & CONTROL SYSTEMS; COMPUTER SCIENCE, CYBERNETICS)
CiteScore: 18.50
Self-citation rate: 11.50%
Articles published: 812
Review time: 6 months
Journal description: The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.