SeeMore: a spatiotemporal predictive model with bidirectional distillation and level-specific meta-adaptation

Impact factor: 7.6 | Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

Science China Information Sciences Pub Date : 2024-07-22 DOI:10.1007/s11432-022-3859-8

Yuqing Ma, Wei Liu, Yajun Gao, Yang Yuan, Shihao Bai, Haotong Qin, Xianglong Liu


Citations: 0

Abstract


SeeMore: a spatiotemporal predictive model with bidirectional distillation and level-specific meta-adaptation

Predicting future frames from historical spatiotemporal data sequences is both challenging and critical, and it has attracted considerable attention from academia and industry. Most spatiotemporal predictive algorithms ignore the valuable backward reasoning ability and the disparate learning complexities among different layers, and hence cannot build good long-term dependencies and spatial correlations, resulting in suboptimal solutions. To address these issues, we propose a two-stage coarse-to-fine spatiotemporal predictive model with bidirectional distillation and level-specific meta-adaptation (SeeMore), which includes a bidirectional distillation network (BDN) and a level-specific meta-adapter (LMA), to gain bidirectional multilevel reasoning. In the first stage, BDN concentrates on bidirectional dynamics modeling and coarsely constructs the spatial correlations of different layers, while LMA is introduced in the second, fine-tuning stage to refine the multilevel spatial correlations from a meta-learning perspective. In particular, BDN mimics the forward and backward reasoning abilities of humans in a distillation manner, which aids in the development of long-term dependencies. The LMA views the learning of different layers as disparate but related tasks and guides the transfer of learning experiences among these tasks through their learning complexities. Thus, each layer could be closer to its solution and could extract more informative spatial correlations. By capturing the enhanced short-term spatial correlations and long-term temporal dependencies, the proposed model could extract adequate knowledge from sequential historical observations and accurately predict future frames whose backtracking preconditions are consistent with the historical sequence. Our work is general and robust enough to be integrated into most spatiotemporal predictive models without requiring additional computation or memory cost during inference. Extensive experiments on four widely used predictive learning benchmarks validated the proposed model's effectiveness in comparison to state-of-the-art approaches (e.g., a 10.6% improvement in Mean Squared Error on the Moving MNIST dataset).
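The abstract reports its headline result as a percentage improvement in Mean Squared Error. As a minimal sketch of how such a figure can be computed, the Python below assumes two things the abstract does not state: that MSE is summed over pixels and averaged over frames (a common convention for Moving MNIST), and that "improvement" means relative reduction against a baseline. The function names `frame_mse` and `relative_improvement` and all numeric values are hypothetical illustrations, not taken from the paper.

```python
import numpy as np


def frame_mse(pred: np.ndarray, target: np.ndarray) -> float:
    """Per-frame sum of squared pixel errors, averaged over frames.

    Both arrays have shape (T, H, W). Summing over pixels and averaging
    over frames is an assumed convention, not one confirmed by the abstract.
    """
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    per_frame = ((pred - target) ** 2).sum(axis=(1, 2))
    return float(per_frame.mean())


def relative_improvement(baseline: float, proposed: float) -> float:
    """Percentage reduction of an error metric relative to a baseline."""
    if baseline <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline - proposed) / baseline


if __name__ == "__main__":
    # Hypothetical error values chosen only to illustrate the arithmetic.
    baseline_mse, proposed_mse = 50.0, 44.7
    gain = relative_improvement(baseline_mse, proposed_mse)
    print(f"relative MSE improvement: {gain:.1f}%")
```

Under these assumptions, a lower `proposed` error yields a positive percentage, so a drop from a baseline error of 50.0 to 44.7 corresponds to a reduction of about 10.6%.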


Source journal

Science China Information Sciences (COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore: 12.60

Self-citation rate: 5.70%

Publications: 224

Review time: 8.3 months

About the journal: Science China Information Sciences publishes high-quality, original research across the information sciences. Its scope covers Computer Science & Technologies, Control Science & Engineering, Information & Communication Engineering, Microelectronics & Solid-State Electronics, and Quantum Information.