Displays. Pub Date: 2025-05-27. DOI: 10.1016/j.displa.2025.103086
Xiongzhi Wang, Boyu Yang, Min Wei, Liangfa Liu, Jingang Zhang, Yunfeng Nie
"Deep learning for endoscopic depth estimation: A review"
Abstract: Depth estimation is a fundamental task in computer vision, crucial for applications such as endoscopic surgical navigation. This paper comprehensively reviews recent advances in endoscopic depth estimation algorithms based on deep learning. We start by briefly describing the basic principles behind depth estimation and how depth maps can be generated from monocular and binocular cues. We then analyze the characteristics of endoscopic datasets. Subsequently, we provide an overview of deep learning applications in endoscopic depth estimation, encompassing supervised, self-supervised, and semi-supervised learning methods, and examine each method's principles, advantages, disadvantages, and performance in practical applications. Additionally, we summarize the performance of current deep learning methods on endoscopic depth estimation and explore the importance of model robustness and generalization. Finally, we propose potential future research directions, such as collecting high-quality data or using simulated data to overcome current dataset limitations, and developing lightweight models to improve real-time performance and robustness. This study aims to offer a comprehensive review for researchers in the field of endoscopic depth estimation, thereby fostering further development in this area.
(Displays, Volume 90, Article 103086)
{"title":"Blind image quality assessment via aesthetic dataset construction and hierarchical feature fusion","authors":"Weifeng Dong, Haibing Yin, Shiling Zhao, Ruiyu Ming, Xia Wang, Xiaofeng Huang, Hongkui Wang","doi":"10.1016/j.displa.2025.103065","DOIUrl":"10.1016/j.displa.2025.103065","url":null,"abstract":"<div><div>Blind image quality assessment (BIQA) has significantly progressed due to rapid advancements in deep learning techniques. However, the objective of the BIQA problem remains ambiguous and is typically approached from two perspectives: the technical perspective, which evaluates the perception of distortions; and the aesthetic perspective, which focuses on content preference and recommendations. Most existing studies predominantly focus on the technical perspective, with relatively few studies addressing the aesthetic perspective. To address this problem, this paper proposes the Aesthetic-Technical Aggregation Quality Assessment (ATAQA) framework by leveraging aesthetic dataset construction and hierarchical feature fusion. Specifically, to enhance aesthetic expression, we first design the Pre-trained Aesthetic-Technical Aggregation (PATA) strategy, whose capabilities for aesthetic feature learning are improved by the Image Aesthetic Quality Dataset (IAQD). Further, we design the Dense Feature Aggregation (DFA) module, that integrates the transformer features hierarchically into the quality-aware feature representation, enabling the model to utilize visual information from low to high levels. Extensive results on several IQA datasets demonstrate that ATAQA significantly outperforms current state-of-the-art (SOTA) methods. Our code will be available after the paper is accepted.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"89 ","pages":"Article 103065"},"PeriodicalIF":3.7,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144108237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays. Pub Date: 2025-05-16. DOI: 10.1016/j.displa.2025.103090
Jianan Zhang, Qing Zhang, Jiansheng Wang, Yan Wang, Qingli Li
"A dual branch based stitching method for whole slide hyperspectral pathological imaging"
Abstract: Hyperspectral imaging technology integrates spatial information with broadband spectral data extending beyond the visible spectrum, enabling in-depth analysis of spectral signatures unique to various tissue components. Whole slide images are an important medium for pathological diagnosis, and image stitching is a foundational technology for producing them. However, very little research has focused on microscopic hyperspectral image stitching, owing to its memory-intensive nature and the sparsity of high-quality image features. To address these limitations, we propose a High-quality Hyperspectral whole slide Image Stitching Method (HHISM) for pathological samples. Since color images offer high spatial resolution, we introduce a color-image-guided dual-branch method for hyperspectral image stitching. The proposed strategy, which takes color-hyperspectral pairs as model input, works with common spectral-scanning hyperspectral microscope hardware to improve data acquisition efficiency. To further enhance performance, we incorporate an inter-branch data-sharing module that enables information exchange between the color and hyperspectral stitching branches, improving the quality of hyperspectral whole slide image stitching. We conducted comprehensive experiments on three kinds of samples to evaluate the effectiveness of the proposed method. Experimental results on H&E samples demonstrate that our method outperforms state-of-the-art medical image stitching methods in terms of quality and hardware-software interaction.
(Displays, Volume 89, Article 103090)
Displays. Pub Date: 2025-05-14. DOI: 10.1016/j.displa.2025.103069
Jing Wang, Guohan Liu, Wenxin Ding, Yuying Li, Wanying Song
"From visual understanding to 6D pose reconstruction: A cutting-edge review of deep learning-based object pose estimation"
Abstract: Object pose estimation, a key problem in computer vision, plays an important role in tasks such as autonomous driving and robot navigation. However, most existing reviews discuss both traditional and deep learning methods and fail to comprehensively define instance-level and category-level object pose estimation. To help researchers better understand this field, this paper summarizes instance-level, category-level, unseen-object, and articulated-body pose estimation methods in detail, filling the gap in the coverage of these emerging areas. Depending on the modality of the input data, we highlight the implementations, application domains, training paradigms, network architectures, and strengths and weaknesses of deep learning-based object pose estimation methods, and compare their performance on different datasets. In addition, this paper comprehensively surveys the evaluation metrics and benchmark datasets in this field, analyzes their scope and applicability in different scenarios, and reveals their key roles in driving technical progress and solving practical problems. Facing current technical bottlenecks, we also look ahead to future directions in multi-view fusion, cross-modal data integration, and novel neural networks, providing new ideas and references for breakthrough progress in object pose estimation.
(Displays, Volume 89, Article 103069)
{"title":"Low scattering and fast response polymer brush-stabilized liquid crystal microlens array with tunable focal length","authors":"Zhenyao Bian, Liang Fang, Wenqiang Wang, Hongbo Lu, Miao Xu","doi":"10.1016/j.displa.2025.103085","DOIUrl":"10.1016/j.displa.2025.103085","url":null,"abstract":"<div><div>Introducing polymers into liquid crystal (LC) devices is an effective way to increase the response speed, but it can also cause additional scattering. A scattering-free and fast-response LC microlens array (LCMLA) based on surface-initiated polymerization (SIP) is demonstrated. Unlike the conventional fabrication of the polymer network liquid crystal (PNLC), the initiator is encapsulated in an alignment layer on the surface of the substrate, separate from the reacting monomers. Ring array patterned electrodes, carefully designed, generate a non-uniform electric field, which in turn induces a gradient refractive index distribution in the LC layer. Upon UV polymerization, the reaction initiates at active sites on the surface, and polymer fibers grow directionally from the substrate, resembling bush-like structures confined to the surface layer. LCMLAs fabricated using this method exhibits a haze of only 13.8 %, offering higher transmittance compared to conventional PNLC MLA. Due to these characteristics, the polymer brush-stabilized liquid crystal (PBSLC) MLA has better imaging performance. Additionally, the response time of the PBSLC MLA is 2.32 ms. Consequently, this PBSLC MLA hold significant potential for applications in the optical communication, fast-switching displays, beam steering and adaptive optics.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"89 ","pages":"Article 103085"},"PeriodicalIF":3.7,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144069207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays. Pub Date: 2025-05-10. DOI: 10.1016/j.displa.2025.103070
Syed Muhammad Salman Bukhari, Nadia Dahmani, Sujan Gyawali, Muhammad Hamza Zafar, Filippo Sanfilippo, Kiran Raja
"Optimizing fire detection in remote sensing imagery for edge devices: A quantization-enhanced hybrid deep learning model"
Abstract: Wildfires are increasing in frequency and severity, presenting critical challenges for timely detection and response, particularly in remote or resource-limited environments. This study introduces the Inception-ResNet Transformer with Quantization (IRTQ), a novel hybrid deep learning (DL) framework that integrates multi-scale feature extraction with global attention and advanced quantization. The proposed model is specifically optimized for edge deployment on platforms such as unmanned aerial vehicles (UAVs), offering a unique combination of high accuracy, low latency, and a compact memory footprint. The IRTQ model achieves 98.9% accuracy across diverse datasets and shows strong generalization through cross-dataset validation. Quantization significantly reduces the parameter count to 0.09M and memory usage to 0.13 MB, enabling real-time inference in 3 ms. Interpretability is further enhanced through Grad-CAM visualizations, supporting transparent decision-making. While achieving state-of-the-art performance, the model encounters challenges in visually ambiguous fire-like regions. To address these, future work will explore multi-modal inputs and extend the model towards multi-class classification. IRTQ represents a technically grounded, interpretable, and deployable solution for AI-driven wildfire detection and disaster response.
(Displays, Volume 89, Article 103070)
Displays. Pub Date: 2025-05-08. DOI: 10.1016/j.displa.2025.103067
Jiamou Yang, Yangtao Wang, Xin Tan, Meie Fang, Lizhuang Ma
"DHP-SLAM: A real-time visual SLAM system with high positioning accuracy under dynamic environment"
Abstract: Traditional visual SLAM frameworks assume an ideal static environment; when dynamic objects appear in a real scene, they greatly degrade the positioning accuracy of the system. To solve this problem, this paper proposes DHP-SLAM, a real-time semantic visual SLAM system with multi-target tracking. By combining semantic instance segmentation with geometric constraints, the dynamic visual SLAM algorithm eliminates the influence of both highly dynamic and potentially dynamic objects, and segments objects accurately in real time to improve positioning. Furthermore, multi-target tracking is integrated into the system: when an object detection is missed, feature points on dynamic objects are removed using the predicted tracking box, giving the system higher robustness and a better understanding of the surrounding environment. We evaluate DHP-SLAM on the indoor TUM dataset and the outdoor KITTI dataset, and conduct extensive experiments comparing the proposed method with state-of-the-art dynamic SLAM systems. On the TUM indoor dataset, our system improves substantially over the original ORB-SLAM2; in dynamic scenes of the outdoor KITTI dataset, its positioning accuracy also improves markedly and compares favorably with other dynamic visual SLAM systems.
(Displays, Volume 89, Article 103067)
Displays. Pub Date: 2025-05-08. DOI: 10.1016/j.displa.2025.103059
Jinhao Li, Zijian Chen, Tingzhu Chen, Zhiji Liu, Changbo Wang
"OBIFormer: A fast attentive denoising framework for oracle bone inscriptions"
Abstract: Oracle bone inscriptions (OBI) are the earliest known form of Chinese characters and serve as a valuable resource for research in anthropology and archaeology. However, most excavated fragments are severely degraded due to thousands of years of natural weathering, corrosion, and man-made destruction, making automatic OBI recognition extremely challenging. Previous methods either focus on pixel-level information or utilize vanilla transformers for glyph-based OBI denoising, which leads to tremendous computational overhead. Therefore, this paper proposes OBIFormer, a fast attentive denoising framework for oracle bone inscriptions. It leverages channel-wise self-attention, glyph extraction, and selective kernel feature fusion to reconstruct denoised images precisely while remaining computationally efficient. OBIFormer achieves state-of-the-art denoising performance in PSNR and SSIM on synthetic and original OBI datasets. Furthermore, comprehensive experiments on a real-world OBI dataset demonstrate its great potential in assisting automatic OBI recognition. The code will be made available at https://github.com/LJHolyGround/OBIFormer.
(Displays, Volume 89, Article 103059)
Displays. Pub Date: 2025-05-06. DOI: 10.1016/j.displa.2025.103063
Yuanyuan Li, Zetian Mi, Xianping Fu
"Multiframe-to-multiframe network for underwater unpaired video enhancement"
Abstract: Underwater video enhancement (UVE) technology plays an indispensable role in accurately perceiving underwater environments. In recent years, researchers have proposed many high-performance underwater image enhancement (UIE) techniques. However, these methods enhance each frame independently, ignoring complementary information between adjacent frames over time, which can lead to visual flickering. Additionally, it is impractical to simultaneously capture degraded underwater videos and their high-quality counterparts. Considering these factors, a multiframe-to-multiframe network for unpaired underwater video enhancement (MMUVE) is proposed for the first time. First, a generative adversarial network based on unpaired contrastive learning is designed to conduct adversarial training between key frames selected from the video frame sequence and unpaired high-quality images, yielding an initially optimized frame sequence. Then, the original frame sequence undergoes temporal enhancement, while the initially optimized sequence is further optimized in the spatial-channel dimension. Finally, dual-branch feature fusion produces the multi-frame enhancement results. Extensive subjective and objective comparisons demonstrate that the proposed method not only maintains temporal consistency during multi-frame enhancement but also achieves better single-frame enhancement results.
(Displays, Volume 89, Article 103063)
Displays. Pub Date: 2025-05-01. DOI: 10.1016/j.displa.2025.103073
Jianing Zhou, Chao Long, Hao Zou, Yan Han, Hui Tan, Liming Duan
"Non-uniform sparse scanning angle selection method for limited angle industrial CT detection of laminated cells"
Abstract: Laminated cells can be rapidly scanned using sparse-angle computed tomography (CT), but the traditional uniform sparse scanning method fails to adequately capture internal structural differences, leading to missing structures in the reconstructed image. To address this issue, we introduce a non-uniform sparse scanning angle selection method for limited-angle industrial CT detection of laminated cells. First, a spectrum distribution map is generated by applying the Fourier transform to the projection data, and a threshold is established as the average of the frequency amplitudes. Next, the number of frequency components with amplitudes exceeding the threshold is counted to select a suitable limited angle range. Then, the non-uniform sparse scanning angles within this range are determined based on the singularity distribution curve in the projection domain. This scanning method ensures that more relevant data is collected while avoiding data redundancy. Finally, the effectiveness of the proposed method is verified through numerical simulation and actual scanning experiments. Compared with the latest scanning angle selection methods, our method collects more data and significantly improves image reconstruction quality while keeping the same number of scanning angles.
(Displays, Volume 89, Article 103073)