Authors: Dehui Dong, Dongping Ming, Lu Xu, Yue Zhang
Journal: ISPRS Journal of Photogrammetry and Remote Sensing, Volume 227, Pages 383-396
Publication date: 2025-06-25
Publication type: Journal Article
DOI: 10.1016/j.isprsjprs.2025.06.024
URL: https://www.sciencedirect.com/science/article/pii/S0924271625002503
Journal impact factor: 10.6 (JCR Q1, Geography, Physical)
A mining scene understanding framework with limited labeled samples jointly driven by object-level spatial relationships and multi-task network
Accurately delineating mining areas is crucial for monitoring illegal mining activities. In large-scale scenes with limited labeled samples, mining areas are easily confused with targets that have similar spectral characteristics, such as clouds, farmland, and roads, which leads to serious misclassification. To address this problem, this paper proposes a mining area understanding framework based on object-level spatial relation constraints, simulating the way humans interpret mining scenes in remote sensing images. The framework first constructs a Multi-task Network for joint Panoptic Segmentation and Relation Prediction (PSRP-MNet), which aims to segment mining area scenes with high precision and to acquire explicit object-level spatial relations. The network contains an explicit spatial relation matching module, a lightweight segmentation head, and multi-scale deformable attention, which together fuse deep-level features across tasks and make effective use of multi-scale semantic information. The spatial relation matching module explicitly models and matches the spatial positional relations between targets in mining areas, helping the model understand mining areas at the level of individual objects. The lightweight design of the segmentation head maintains high performance while reducing model complexity and parameter count. Subsequently, the spatial relations are matched against the prior object-level spatial relation knowledge criteria constructed in this paper to determine the integrated functional structures in the scene, which further constrain the segmentation results. The guidance of spatial relations helps PSRP-MNet correct its predictions when errors occur, leading to excellent performance in limited labeled sample tasks. Two sufficiently large scenes were selected as study areas, and approximately 1000 image samples were used for training. Multiple sets of comparative experiments were conducted to validate the framework’s effectiveness and cross-regional generalization ability in limited labeled sample tasks. The experiments showed that introducing spatial relations and the association between different tasks reduced the error accumulation of PSRP-MNet. This research is expected to provide a reference for the regular monitoring of mineral resources.
About the journal:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive.
P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields.
In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.