Weibin Chen, Ying Zou, Zhiyong Xu, Li Xu, Shiping Wang
{"title":"多视图学习满足状态空间模型:动态系统视角。","authors":"Weibin Chen , Ying Zou , Zhiyong Xu , Li Xu , Shiping Wang","doi":"10.1016/j.neunet.2025.108088","DOIUrl":null,"url":null,"abstract":"<div><div>Multi-view learning exploits the complementary nature of multiple modalities to enhance performance across diverse tasks. While deep learning has significantly advanced these fields by enabling sophisticated modeling of intra-view and cross-view interactions, many existing approaches still rely on heuristic architectures and lack a principled framework to capture the dynamic evolution of feature representations. This limitation hampers interpretability and theoretical understanding. To address these challenges, this paper introduces the Multi-view State-Space Model (MvSSM), which formulates multi-view representation learning as a continuous-time dynamical system inspired by control theory. In this framework, view-specific features are treated as external inputs, and a shared latent representation evolves as the internal system state, driven by learnable dynamics. This formulation unifies feature integration and label prediction within a single interpretable model, enabling theoretical analysis of system stability and representational transitions. Two variants, MvSSM-Lap and MvSSM-iLap, are further developed using Laplace and inverse Laplace transformations to derive system dynamics representations. These solutions exhibit structural similarities to graph convolution operations in deep networks, supporting efficient feature propagation and theoretical interpretability. Experiments on benchmark datasets such as IAPR-TC12, and ESP demonstrate the effectiveness of the proposed method, achieving up to 4.31 % improvement in accuracy and 4.27 % in F1-score over existing state-of-the-art approaches.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"Article 108088"},"PeriodicalIF":6.3000,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-view learning meets state-space model: A dynamical system perspective\",\"authors\":\"Weibin Chen , Ying Zou , Zhiyong Xu , Li Xu , Shiping Wang\",\"doi\":\"10.1016/j.neunet.2025.108088\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Multi-view learning exploits the complementary nature of multiple modalities to enhance performance across diverse tasks. While deep learning has significantly advanced these fields by enabling sophisticated modeling of intra-view and cross-view interactions, many existing approaches still rely on heuristic architectures and lack a principled framework to capture the dynamic evolution of feature representations. This limitation hampers interpretability and theoretical understanding. To address these challenges, this paper introduces the Multi-view State-Space Model (MvSSM), which formulates multi-view representation learning as a continuous-time dynamical system inspired by control theory. In this framework, view-specific features are treated as external inputs, and a shared latent representation evolves as the internal system state, driven by learnable dynamics. This formulation unifies feature integration and label prediction within a single interpretable model, enabling theoretical analysis of system stability and representational transitions. 
Two variants, MvSSM-Lap and MvSSM-iLap, are further developed using Laplace and inverse Laplace transformations to derive system dynamics representations. These solutions exhibit structural similarities to graph convolution operations in deep networks, supporting efficient feature propagation and theoretical interpretability. Experiments on benchmark datasets such as IAPR-TC12, and ESP demonstrate the effectiveness of the proposed method, achieving up to 4.31 % improvement in accuracy and 4.27 % in F1-score over existing state-of-the-art approaches.</div></div>\",\"PeriodicalId\":49763,\"journal\":{\"name\":\"Neural Networks\",\"volume\":\"194 \",\"pages\":\"Article 108088\"},\"PeriodicalIF\":6.3000,\"publicationDate\":\"2025-09-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0893608025009682\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025009682","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Multi-view learning meets state-space model: A dynamical system perspective
Multi-view learning exploits the complementary nature of multiple modalities to enhance performance across diverse tasks. While deep learning has significantly advanced the field by enabling sophisticated modeling of intra-view and cross-view interactions, many existing approaches still rely on heuristic architectures and lack a principled framework to capture the dynamic evolution of feature representations. This limitation hampers interpretability and theoretical understanding. To address these challenges, this paper introduces the Multi-view State-Space Model (MvSSM), which formulates multi-view representation learning as a continuous-time dynamical system inspired by control theory. In this framework, view-specific features are treated as external inputs, and a shared latent representation evolves as the internal system state, driven by learnable dynamics. This formulation unifies feature integration and label prediction within a single interpretable model, enabling theoretical analysis of system stability and representational transitions. Two variants, MvSSM-Lap and MvSSM-iLap, are further developed using Laplace and inverse Laplace transformations to derive system dynamics representations. These solutions exhibit structural similarities to graph convolution operations in deep networks, supporting efficient feature propagation and theoretical interpretability. Experiments on benchmark datasets such as IAPR-TC12 and ESP demonstrate the effectiveness of the proposed method, achieving up to a 4.31% improvement in accuracy and 4.27% in F1-score over existing state-of-the-art approaches.
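To make the state-space formulation in the abstract concrete, the following is a minimal worked sketch of a generic continuous-time linear state-space system with view-specific inputs and its Laplace-domain solution; the symbols $\mathbf{h}$, $\mathbf{x}_v$, $\mathbf{A}$, $\mathbf{B}_v$, $\mathbf{C}$, and the number of views $V$ are illustrative placeholders and are not taken from the paper's own notation.

\[
\dot{\mathbf{h}}(t) = \mathbf{A}\,\mathbf{h}(t) + \sum_{v=1}^{V} \mathbf{B}_v\,\mathbf{x}_v(t),
\qquad
\mathbf{y}(t) = \mathbf{C}\,\mathbf{h}(t),
\]
where each view's features $\mathbf{x}_v(t)$ act as an external input, $\mathbf{h}(t)$ is the shared latent state, and $\mathbf{y}(t)$ is the prediction. Taking the Laplace transform with zero initial state gives
\[
\mathbf{H}(s) = (s\mathbf{I} - \mathbf{A})^{-1} \sum_{v=1}^{V} \mathbf{B}_v\,\mathbf{X}_v(s),
\]
and applying the inverse Laplace transform recovers the convolution form
\[
\mathbf{h}(t) = \int_{0}^{t} e^{\mathbf{A}(t-\tau)} \sum_{v=1}^{V} \mathbf{B}_v\,\mathbf{x}_v(\tau)\, d\tau .
\]
Under this reading, the matrix exponential of $\mathbf{A}$ repeatedly propagates features through the state matrix; if $\mathbf{A}$ encodes a graph or affinity structure, this propagation is structurally analogous to stacking graph-convolution layers, which is the kind of correspondence the abstract alludes to.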
Journal introduction:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.