Hao Sheng, Rongshan Chen, Ruixuan Cong, Da Yang, Zhenglong Cui, Sizhe Wang
Pattern Recognition, vol. 172, Article 112403. Published 2025-09-04. Impact factor: 7.6 (JCR Q1, Computer Science, Artificial Intelligence). DOI: 10.1016/j.patcog.2025.112403. URL: https://www.sciencedirect.com/science/article/pii/S0031320325010647
Learn depth space from light field via a distance-constraint query mechanism
The Light Field (LF) captures both spatial and angular information of scenes, enabling precise depth estimation. Recent advancements in deep learning have led to significant success in this field; however, existing methods primarily focus on modeling surface characteristics (e.g., depth maps) while overlooking the depth space, which contains additional valuable information. The depth space consists of numerous space points and provides substantially more geometric data than a single depth map. In this paper, we conceptualize depth prediction as a spatial modeling problem, aiming to learn the entire depth space rather than merely a single depth map. Specifically, we define space points as signed distances relative to the scene surface and propose a novel distance-constraint query mechanism for LF depth estimation. To model the depth space effectively, we first develop a mixed sampling strategy to approximate its data representation. Subsequently, we introduce an encoder-decoder network architecture to query the distances of each point, thereby implicitly embedding the depth space. Finally, to extract the target depth map from this space, we present a generation algorithm that iteratively invokes the decoder network. Through extensive experiments, our approach achieves the highest performance on LF depth estimation benchmarks, and also demonstrates superior performance on various synthetic and real-world scenes.
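The query idea in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the learned encoder-decoder is replaced by an analytic stand-in, and the sign convention (positive in front of the surface), the truncation value `TRUNC`, the depth range, and the names `signed_distance`, `mixed_samples`, and `extract_depth` are all assumptions made for illustration. The sampling function shows only one plausible reading of "mixed sampling" (uniform samples over the depth range plus samples concentrated near the surface), and the extraction loop shows one simple way a depth map could be recovered by repeatedly querying a distance function.

```python
import numpy as np

TRUNC = 0.5  # assumed truncation of the signed distance (illustrative only)


def signed_distance(query_depth, surface_depth, trunc=TRUNC):
    """Signed distance from query points to the surface along the depth axis.

    Assumed convention: positive in front of the surface, negative behind it,
    clipped to [-trunc, trunc].
    """
    return np.clip(surface_depth - query_depth, -trunc, trunc)


def mixed_samples(surface_depth, d_min, d_max, n_uniform, n_near, sigma, rng):
    """Draw per-pixel query depths: uniform over the range plus near-surface."""
    h, w = surface_depth.shape
    uniform = rng.uniform(d_min, d_max, size=(n_uniform, h, w))
    near = surface_depth + rng.normal(0.0, sigma, size=(n_near, h, w))
    return np.clip(np.concatenate([uniform, near], axis=0), d_min, d_max)


def extract_depth(decoder, shape, d_min, d_max, max_iters=32, tol=1e-6):
    """Step each pixel's query depth by the predicted distance until it is ~0."""
    depth = np.full(shape, d_min, dtype=np.float64)
    for _ in range(max_iters):
        sd = decoder(depth)
        if np.max(np.abs(sd)) < tol:
            break
        depth = np.clip(depth + sd, d_min, d_max)
    return depth


rng = np.random.default_rng(0)
surface = rng.uniform(0.5, 3.5, size=(4, 5))  # hypothetical ground-truth depth map

# Training-style data: query points and their signed-distance targets.
queries = mixed_samples(surface, 0.0, 4.0, n_uniform=8, n_near=8, sigma=0.05, rng=rng)
targets = signed_distance(queries, surface)


def toy_decoder(query_depth):
    """Analytic stand-in for a trained decoder network."""
    return signed_distance(query_depth, surface)


recovered = extract_depth(toy_decoder, surface.shape, 0.0, 4.0)
```

With the analytic stand-in, `recovered` matches `surface` to within floating-point error after a handful of iterations, because each step moves the query by at most `TRUNC` toward the zero crossing. A learned decoder would only approximate this behavior, so the step rule and stopping tolerance would need tuning.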
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.