Multispectral image reconstruction from RGB image for maize growth status monitoring based on window-adaptive spatial-spectral attention transformer
Di Song, Hong Sun, Esther Ngumbi, Mohammed Kamruzzaman
Computers and Electronics in Agriculture, Volume 239, Article 111062. Published 2025-10-04. DOI: 10.1016/j.compag.2025.111062
https://www.sciencedirect.com/science/article/pii/S0168169925011688
Abstract
Multispectral image analysis is an effective way to detect crop growth status. However, the complexity of the manufacturing process and technology behind multispectral image acquisition equipment makes data acquisition expensive. Therefore, a method based on a window-adaptive spatial-spectral attention transformer is proposed to reconstruct multispectral images from RGB images of maize. First, RGB and hyperspectral images of the maize are obtained, and the reflectance data for classic and preferred band combinations are extracted from the hyperspectral images. Then, a transformer model is constructed to evaluate and compare the reconstruction efficacy of the 5-band and 10-band combinations across four attention modes: spatial, spectral, spatial-spectral, and window-adaptive spatial-spectral attention. The best-performing reconstruction results are selected and compared with the original data from three perspectives: image, spectrum, and model effect. The 10-band multispectral image reconstructed by the window-adaptive spatial-spectral attention mechanism is highly similar to the original image, with a reflectance correlation exceeding 0.99. Furthermore, when applied to monitoring crop growth status (i.e., maize chlorophyll), it yields results closely aligned with those obtained from actual reflectance data: $R_C^2$ is 0.76 and $R_V^2$ is 0.64, while $\mathrm{RMSE}_C$ and $\mathrm{RMSE}_V$ are 3.63 mg/L and 2.94 mg/L, respectively. To further explore the model performance, new sensitive bands are selected and reconstructed for the maize V7 stage. The resulting chlorophyll content prediction model achieves an $R_C^2$ of 0.64 and an $R_V^2$ of 0.60, with $\mathrm{RMSE}_C$ and $\mathrm{RMSE}_V$ of 5.61 mg/L and 5.62 mg/L, respectively. Therefore, the window-adaptive spatial-spectral attention transformer can accurately reconstruct multispectral images and establish precise growth status monitoring models, providing technical support for low-cost field maize growth detection.
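The abstract reports three kinds of figures: a reflectance correlation between reconstructed and original bands, a coefficient of determination, and a root-mean-square error for the chlorophyll model. The abstract does not give the exact formulas, so the following is only a minimal sketch using the standard definitions. The function names (`r_squared`, `rmse`, `band_correlation`), the `(height, width, bands)` array layout, and the synthetic data are all assumptions made for illustration, not the authors' implementation or data.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the inputs (e.g. mg/L)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def band_correlation(reference, reconstructed):
    """Pearson correlation per band between two (height, width, bands) cubes.

    The cube layout is an assumption for this sketch.
    """
    ref = np.asarray(reference, dtype=float)
    rec = np.asarray(reconstructed, dtype=float)
    bands = ref.shape[-1]
    # Flatten the spatial dimensions so each column holds one band's pixels.
    ref = ref.reshape(-1, bands)
    rec = rec.reshape(-1, bands)
    return np.array(
        [np.corrcoef(ref[:, b], rec[:, b])[0, 1] for b in range(bands)]
    )


if __name__ == "__main__":
    # Synthetic stand-in data, NOT values from the paper.
    rng = np.random.default_rng(0)
    reference = rng.uniform(0.05, 0.6, size=(32, 32, 10))
    reconstructed = reference + rng.normal(0.0, 0.005, size=reference.shape)
    print("lowest per-band correlation:", band_correlation(reference, reconstructed).min())

    chlorophyll_true = rng.uniform(20.0, 60.0, size=50)
    chlorophyll_pred = chlorophyll_true + rng.normal(0.0, 4.0, size=50)
    print("R^2:", r_squared(chlorophyll_true, chlorophyll_pred))
    print("RMSE:", rmse(chlorophyll_true, chlorophyll_pred))
```

Computing the correlation per band rather than over the whole cube makes it visible when a single reconstructed band lags behind the others.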
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.