Time-Series Forecasting, Knowledge Distillation, and Refinement within a Multimodal PDE Foundation Model
Derek Jollie, Jingmin Sun, Zecheng Zhang, Hayden Schaeffer
arXiv - CS - Machine Learning, 2024-09-17. DOI: arxiv-2409.11609

Symbolic encoding has been used in multi-operator learning as a way to embed
additional information for distinct time-series data. For spatiotemporal
systems described by time-dependent partial differential equations, the
equation itself provides an additional modality to identify the system. Using
symbolic expressions alongside time-series samples enables the development of
multimodal predictive neural networks. A key challenge with
current approaches is that the symbolic information, i.e., the equations, must
be manually preprocessed (simplified, rearranged, etc.) to match and relate to
the existing token library, which increases costs and reduces flexibility,
especially when dealing with new differential equations. We propose a new token
library based on SymPy to encode differential equations as an additional
modality for time-series models. The proposed approach incurs minimal cost, is
automated, and maintains high prediction accuracy for forecasting tasks.
Additionally, we include a Bayesian filtering module that connects the
different modalities to refine the learned equation. This refinement improves the
accuracy of both the learned symbolic representation and the predicted time series.
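
As a rough illustration of what an automated, SymPy-based encoding of a differential equation might look like, the sketch below flattens a SymPy expression tree into prefix-order tokens and maps them to integer ids. The abstract does not specify the actual token library, so this is only an assumption-laden example: the helper `tokenize`, the `vocab` mapping, and the choice of the viscous Burgers equation as input are all hypothetical and are not taken from the paper.

```python
import sympy as sp
from sympy.core.function import AppliedUndef


def tokenize(expr):
    """Flatten a SymPy expression into a list of prefix-order string tokens.

    Hypothetical scheme (not the paper's token library):
      - derivatives -> "Derivative", tokens of the differentiated expression,
        then one (variable, order) pair per differentiation variable
      - atoms (symbols, numbers) -> their string form
      - unknown functions such as u(x, t) -> the function name only
      - other operators -> class name, argument count, then the arguments
    """
    if isinstance(expr, sp.Derivative):
        tokens = ["Derivative"] + tokenize(expr.expr)
        for var, order in expr.variable_count:
            tokens += [str(var), str(order)]
        return tokens
    if expr.is_Atom:
        return [str(expr)]
    if isinstance(expr, AppliedUndef):
        return [expr.func.__name__]
    tokens = [type(expr).__name__, str(len(expr.args))]
    for arg in expr.args:
        tokens += tokenize(arg)
    return tokens


# Illustrative input: viscous Burgers equation, u_t + u*u_x - nu*u_xx = 0.
x, t, nu = sp.symbols("x t nu")
u = sp.Function("u")(x, t)
burgers = sp.Derivative(u, t) + u * sp.Derivative(u, x) - nu * sp.Derivative(u, x, 2)

tokens = tokenize(burgers)

# Map tokens to integer ids, as a sequence model would consume them.
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
ids = [vocab[tok] for tok in tokens]

print(tokens)
print(ids)
```

Because the tokens are read directly from SymPy's own expression tree, no manual simplification or rearrangement of the equation is needed before encoding. The argument order inside sums and products follows SymPy's canonical ordering rather than the order in which the equation was typed.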