Multimodal air-quality prediction: A multimodal feature fusion network based on shared-specific modal feature decoupling

IF 4.6 · JCR Q1 (Computer Science, Interdisciplinary Applications) · CAS Zone 2 (Environmental Science & Ecology)
Xiaoxia Chen, Zhen Wang, Fangyan Dong, Kaoru Hirota
Journal: Environmental Modelling & Software, Volume 192, Article 106553
DOI: 10.1016/j.envsoft.2025.106553
Published: 2025-06-17 (Journal Article) · Citations: 0

Abstract


Severe air pollution degrades air quality and threatens human health, necessitating accurate prediction for pollution control. While spatiotemporal networks integrating sequence models and graph structures dominate current methods, prior work neglects multimodal data fusion to enhance feature representation. This study addresses the spatial limitations of single-perspective ground monitoring by synergizing remote sensing data, which provides global air quality distribution, with ground observations. We propose a Shared-Specific Modality Decoupling-based Spatiotemporal Multimodal Fusion Network for air-quality prediction, comprising: (1) feature extractors for remote sensing images and ground monitoring data, (2) a decoupling module separating shared and modality-specific features, and (3) a hierarchical attention-graph convolution fusion module. This framework achieves effective multimodal fusion by disentangling cross-modal dependencies while preserving unique characteristics. Evaluations on two real-world datasets demonstrate superior performance over baseline models, validating the efficacy of multimodal integration for spatial–temporal air quality forecasting.
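The shared-specific decoupling idea described above can be illustrated with a minimal, hypothetical NumPy sketch. This is not the authors' implementation: all names, dimensions, and the tanh projections are illustrative assumptions. Each modality passes through one projection whose weights are tied across modalities (yielding shared features) and one modality-specific projection; a simple fusion then concatenates the averaged shared part with both specific parts.

```python
import numpy as np

rng = np.random.default_rng(0)

def project(x, w):
    """Linear projection followed by a tanh nonlinearity."""
    return np.tanh(x @ w)

n_stations, d_in, d_feat = 16, 8, 4
# Hypothetical inputs: remote-sensing features and ground-monitoring features
x_rs = rng.normal(size=(n_stations, d_in))
x_gr = rng.normal(size=(n_stations, d_in))

# One projection shared by both modalities, plus one specific projection each
w_shared = 0.1 * rng.normal(size=(d_in, d_feat))
w_rs = 0.1 * rng.normal(size=(d_in, d_feat))
w_gr = 0.1 * rng.normal(size=(d_in, d_feat))

shared_rs = project(x_rs, w_shared)  # shared subspace, remote sensing
shared_gr = project(x_gr, w_shared)  # shared subspace, ground data
spec_rs = project(x_rs, w_rs)        # modality-specific, remote sensing
spec_gr = project(x_gr, w_gr)        # modality-specific, ground data

# Fusion: average the shared parts, keep both specific parts
fused = np.concatenate([(shared_rs + shared_gr) / 2, spec_rs, spec_gr], axis=1)
print(fused.shape)  # (16, 12)
```

In a trained model the shared projections would additionally be encouraged to align across modalities (e.g., by a similarity loss) and the specific parts kept distinct from them; this sketch only shows the structural split that the decoupling module formalizes.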
Source journal: Environmental Modelling & Software (Engineering: Environmental)
CiteScore: 9.30 · Self-citation rate: 8.20% · Articles per year: 241 · Review time: 60 days
Aims and scope: Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.