Displays, Pub Date: 2025-02-27, DOI: 10.1016/j.displa.2025.102992
Shuchang Zhou, Hanxin Wang, Qingbo Wu, Fanman Meng, Linfeng Xu, Wei Zhang, Hongliang Li
{"title":"Adversarially Regularized Tri-Transformer Fusion for continual multimodal egocentric activity recognition","authors":"Shuchang Zhou , Hanxin Wang , Qingbo Wu , Fanman Meng , Linfeng Xu , Wei Zhang , Hongliang Li","doi":"10.1016/j.displa.2025.102992","DOIUrl":"10.1016/j.displa.2025.102992","url":null,"abstract":"<div><div>Continual egocentric activity recognition aims to understand first-person activity from the multimodal data captured from wearable devices in streaming environments. Existing continual learning (CL) methods hardly acquire discriminative multimodal representations of activity classes from different isolated stages. To address this issue, this paper proposes an Adversarially Regularized Tri-Transformer Fusion (ARTF) model composed of three frozen transformer backbones with dynamic expansion architecture, which enables flexible and progressive multimodal representation fusion in the CL setting. To mitigate the confusion across different stages, we adopt an adversary-based confusion feature generation strategy to augment unknown classes, explicitly simulating out-stage features that closely resemble those within the stage. Then, the discriminative multimodal fusion representations could be learned by joint training on the current and augmented data at different stages. Experiments show that our model significantly outperforms state-of-the-art CL methods for multimodal continual egocentric activity recognition.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 102992"},"PeriodicalIF":3.7,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143529734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays, Pub Date: 2025-02-26, DOI: 10.1016/j.displa.2025.103007
Jiale Chao, Jialin Lei, Xionghui Zhou, Le Xie
{"title":"A general and flexible point cloud simplification method based on feature fusion","authors":"Jiale Chao, Jialin Lei, Xionghui Zhou, Le Xie","doi":"10.1016/j.displa.2025.103007","DOIUrl":"10.1016/j.displa.2025.103007","url":null,"abstract":"<div><div>Large-scale, high-density point cloud data often pose challenges for direct application in various downstream tasks. To address this issue, this paper introduces a flexible point cloud simplification method based on feature fusion. After conducting a comprehensive analysis of the input point cloud, the method fuses the density feature that reflects point cloud uniformity with local geometric features that capture shape details. Based on the simplification objectives and fused feature values, the method optimizes the point distribution from a global perspective. Subsequently, by removing distance factors, purely local geometric features are incorporated into the farthest point sampling process and a feature-weighted voxel farthest point sampling algorithm is proposed to prioritize the preservation of local feature points. With a refined mechanism for adjusting point numbers, the method finally achieves fast and reasonable simplification of massive point clouds. Furthermore, extensive experiments have been designed to explore the impact of the features involved and their sensitivity to simplification results, offering detailed recommendations for parameter configuration. This method supports flexible transitions between global uniformity and heavy local feature preservation. Comparative results with previous studies demonstrate its excellent balance, exhibiting strong competitiveness in both output point cloud quality and computational efficiency. The core source code is publicly available at: <span><span>https://github.com/chaojiale/PointCloudSimplification</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 103007"},"PeriodicalIF":3.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143529733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays, Pub Date: 2025-02-25, DOI: 10.1016/j.displa.2025.103003
Lianmin Zhang, Hongkui Wang, Qionghua Luo, Wei Zhang, Haibing Yin, Tiansong Li, Li Yu, Wenyao Zhu
{"title":"Bayesian generation based foveated JND estimation in the DCT domain","authors":"Lianmin Zhang , Hongkui Wang , Qionghua Luo , Wei Zhang , Haibing Yin , Tiansong Li , Li Yu , Wenyao Zhu","doi":"10.1016/j.displa.2025.103003","DOIUrl":"10.1016/j.displa.2025.103003","url":null,"abstract":"<div><div>The Just Noticeable Distortion (JND) threshold refers to the inability of the human visual system (HVS) to perceive pixel changes below a certain visibility threshold. In this paper, we focus on the cross-domain operation problem of JND estimation in the DCT domain. In order to solve this problem and improve the accuracy of DCT-JND estimation, we design an autoregressive model based on the Bayesian generation theory to simulate the spontaneous predictive behavior of HVS. Based on this model, an entropy masking (EM) effect based JND moderator is then proposed. Considering the visual attention and foveated masking (VFM) effect, this paper predicts visual saliency and the fixation points in the DCT domain, an enhanced foveated masking effect based JND moderator is then presented. Finally, combined with other JND moderators, the Bayesian generation based foveated DCT-JND model is obtained. Subjective and objective experimental results show that the proposed model could further improve the accuracy of JND threshold estimation in the DCT domain while avoiding the cross-domain operation.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 103003"},"PeriodicalIF":3.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143510569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays, Pub Date: 2025-02-25, DOI: 10.1016/j.displa.2025.103002
Hui Hu, Yunhui Shi, Jin Wang, Nam Ling, Baocai Yin
{"title":"Feature enhanced spherical transformer for spherical image compression","authors":"Hui Hu , Yunhui Shi , Jin Wang , Nam Ling , Baocai Yin","doi":"10.1016/j.displa.2025.103002","DOIUrl":"10.1016/j.displa.2025.103002","url":null,"abstract":"<div><div>It is well known that the wide field of view of spherical images requires high resolution, which increases the challenges of storage and transmission. Recently, a spherical learning-based image compression method called OSLO has been proposed, which leverages HEALPix’s approximately uniform spherical sampling. However, HEALPix sampling can only utilize a fixed 3 × 3 convolution kernel, resulting in a limited receptive field and an inability to capture non-local information. This limitation hinders redundancy removal during the transform and texture synthesis during the inverse transform. To address this issue, we propose a feature-enhanced spherical Transformer-based image compression method that leverages HEALPix’s hierarchical structure. Specifically, to reduce the computational complexity of the Transformer’s attention mechanism, we divide the sphere into multiple windows using HEALPix’s hierarchical structure and compute attention within these spherical windows. Since there is no communication between adjacent windows, we introduce spherical convolution to aggregate information from neighboring windows based on their local correlation. Additionally, to enhance the representational ability of features, we incorporate an inverted residual bottleneck module for feature embedding and a feedforward neural network. Experimental results demonstrate that our method outperforms OSLO, achieving lower codec time.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 103002"},"PeriodicalIF":3.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143510571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays, Pub Date: 2025-02-24, DOI: 10.1016/j.displa.2025.103008
Zhao Yanzeng, Zhu Keyong, Xu Haixin, Liu Ziyu, Luo Pengyu, Wang Lijing
{"title":"Is red alert always optimal? An empirical study on the effects of red and blue feedback on performance under excessive stress","authors":"Zhao Yanzeng , Zhu Keyong , Xu Haixin , Liu Ziyu , Luo Pengyu , Wang Lijing","doi":"10.1016/j.displa.2025.103008","DOIUrl":"10.1016/j.displa.2025.103008","url":null,"abstract":"<div><h3>Background</h3><div>In critical situations, a pilot’s ability to recognize and adjust excessive stress levels is vital for risk mitigation, especially in single pilot operations where self-awareness is crucial. Research on monitoring and feedback for excessive stress is limited, and few studies have examined how visual feedback from display interfaces can enhance pilot performance. Traditional alert interfaces predominantly use red feedback, but the unique cognitive characteristics associated with excessive stress may lead to negative outcomes when red feedback is employed. Therefore, it is essential to investigate the effectiveness of feedback under these conditions.</div></div><div><h3>Methods</h3><div>This study utilized the MATB (Multi-tasking Ability Task Battery), an effective abstract flight task experimental prototype with stress inducing in participants through the TSST (Trier Social Stress Test) paradigm. The categories of stress were assessed using the Yerkes-Dodson Law. Audio signal was used to train a Probabilistic Neural Network (PNN) model for real-time discrimination of excessive stress levels, providing participants with one of three types of visual feedback: no feedback, red feedback, or blue feedback. The experiment was designed with an within-subjects approach, involving 20 participants.</div></div><div><h3>Results</h3><div>There were no significant differences in primary task performance. However, secondary task performance was significantly poor under red feedback compared to blue feedback. Additionally, there were no significant differences between red feedback and no feedback conditions.</div></div><div><h3>Conclusion</h3><div>This study suggests that feedback for excessive stress should take into account its unique characteristics, recommending caution in the use of red alerts. The findings provide valuable insights for future human–computer interface design.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 103008"},"PeriodicalIF":3.7,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143510570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays, Pub Date: 2025-02-24, DOI: 10.1016/j.displa.2025.103005
Cui Gan, Chaofeng Li, Gangping Zhang, Guanghua Fu
{"title":"DBNDiff: Dual-branch network-based diffusion model for infrared ship image super-resolution","authors":"Cui Gan, Chaofeng Li, Gangping Zhang, Guanghua Fu","doi":"10.1016/j.displa.2025.103005","DOIUrl":"10.1016/j.displa.2025.103005","url":null,"abstract":"<div><div>Infrared ship image super-resolution (SR) is important for dim and small ship object detection and tracking. However, there are still challenges for large-scale factors SR of infrared ship images, as infrared images require a greater amount of global edge information compared to visible images. To overcome this challenge, we introduce a novel dual-branch network-based diffusion model (DBNDiff) for infrared ship image SR, which incorporates a noise prediction (NP) branch and an edge reconstruction (ER) branch within its conditional noise prediction network (CNPN). In the NP branch, to perform better noise prediction, a hybrid cross-attention (HCA) block is used for the interaction between global and local information. In the ER branch, ER blocks are stacked to extract edge information. Furthermore, an edge loss function is introduced to preserve more edges and details. Extensive experiments on infrared ship image datasets highlight that our DBNDiff outperforms other SR methods, especially showing the best visual quality at large-scale factors SR tasks.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 103005"},"PeriodicalIF":3.7,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143511230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DARF: Depth-Aware Generalizable Neural Radiance Field","authors":"Yue Shi, Dingyi Rong, Chang Chen, Chaofan Ma, Bingbing Ni, Wenjun Zhang","doi":"10.1016/j.displa.2025.102996","DOIUrl":"10.1016/j.displa.2025.102996","url":null,"abstract":"<div><div>Neural Radiance Field (NeRF) has revolutionized novel-view rendering tasks and achieved impressive results. However, the inefficient sampling and per-scene optimization hinder its wide applications. Though some generalizable NeRFs have been proposed, the rendering quality is unsatisfactory due to the lack of geometry and scene uniqueness. To address these issues, we propose the Depth-Aware Generalizable Neural Radiance Field (DARF) with a Depth-Aware Dynamic Sampling (DADS) strategy to perform efficient novel view rendering and unsupervised depth estimation on unseen scenes without per-scene optimization. Distinct from most existing generalizable NeRFs, our framework infers the unseen scenes on both pixel level and geometry level with only a few input images. By introducing a pre-trained depth estimation module to derive the depth prior, narrowing down the ray sampling interval to the proximity space of the estimated surface, and sampling in expectation maximum position, we preserve scene characteristics while learning common attributes for novel-view synthesis. Moreover, we introduce a Multi-level Semantic Consistency loss (MSC) to assist with more informative representation learning. Extensive experiments on indoor and outdoor datasets show that compared with state-of-the-art generalizable NeRF methods, DARF reduces samples by 50%, while improving rendering quality and depth estimation. Our code is available on <span><span>https://github.com/shiyue001/DARF.git</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 102996"},"PeriodicalIF":3.7,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143471347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-level perturbations in image and feature spaces for semi-supervised medical image segmentation","authors":"Feiniu Yuan , Biao Xiang , Zhengxiao Zhang , Changhong Xie , Yuming Fang","doi":"10.1016/j.displa.2025.103001","DOIUrl":"10.1016/j.displa.2025.103001","url":null,"abstract":"<div><div>Consistency regularization has emerged as a vital training strategy for semi-supervised learning. It is very important for medical image segmentation due to rare labeled data. To greatly enhance consistency regularization, we propose a novel Semi-supervised Learning framework with Multi-level Perturbations (SLMP) in both image and feature spaces. In image space, we propose external perturbations with three levels to greatly increase data variations. A low-level perturbation uses traditional augmentation techniques for firstly expanding data. Then, a middle-level one adopts copying and pasting techniques to combine low-level augmented versions of labeled and unlabeled data for generating new images. Middle-level perturbed images contain novel contents, which are totally different from original ones. Finally, a high-level one generates images from middle-level augmented data. In feature space, we design an Indicative Fusion Block (IFB) to propose internal perturbations for randomly mixing the encoded features of middle and high-level augmented images. By utilizing multi-level perturbations, we design a student–teacher semi-supervised learning framework for effectively improving the model resilience to strong variances. Experimental results show that our model achieves the state-of-the-art performance across various evaluation metrics on 2D and 3D medical image datasets. Our model exhibits the powerful capability of feature learning, and significantly outperforms existing state-of-the-art methods. Intensive ablation studies prove that our contributions are effective and significant. The model code is available at <span><span>https://github.com/CamillerFerros/SLMP</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 103001"},"PeriodicalIF":3.7,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143471346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays, Pub Date: 2025-02-19, DOI: 10.1016/j.displa.2025.102995
Wenyang Wang, Peng Zhang, Yuanxin Wang, Shoufeng Tong
{"title":"Investigation into preventing piracy based on the temporal perception difference between devices and humans using modulated projection light","authors":"Wenyang Wang , Peng Zhang , Yuanxin Wang , Shoufeng Tong","doi":"10.1016/j.displa.2025.102995","DOIUrl":"10.1016/j.displa.2025.102995","url":null,"abstract":"<div><div>The piracy of intellectual property rights and privacy through unauthorized capturing of images and videos has been increasing rapidly. We introduce a new methodology for preventing piracy, that utilizes a light source that emits specially modulated projection light to embed imperceptible watermark patterns in images and videos, thereby degrading their quality. The modulation ways of light source exploit the temporal perception difference between the human visual system (HVS) and the image sensor devices. We employed a model-driven approach to optimize the modulation ways of light source in order to effectively prevent piracy. We have also designed experiments to discuss the degradation of image quality at different factors and evaluate the effectiveness of the proposed method. Extensive objective evaluations under different scenarios demonstrated that our method can effectively prevent piracy on various smartphones. Subjective tests on volunteers demonstrated that the modulated light source appears to the HVS to be the same as a steady light source.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 102995"},"PeriodicalIF":3.7,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143480080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Displays, Pub Date: 2025-02-18, DOI: 10.1016/j.displa.2025.103004
Sahul Hameed Syed Ali, Seung-Ho Hong, Jang-Kun Song
{"title":"Large aperture nano-colloidal lenses with dual-hole electrodes for reduced image distortion","authors":"Sahul Hameed Syed Ali, Seung-Ho Hong, Jang-Kun Song","doi":"10.1016/j.displa.2025.103004","DOIUrl":"10.1016/j.displa.2025.103004","url":null,"abstract":"<div><div>Focus-tunable lenses without mechanical components are highly beneficial across various fields, including augmented reality (AR) devices, yet achieving a practical level of this technology is challenging. Recently, nano-colloidal lenses employing two-dimensional (2D) ZrP nanoparticles have been proposed as a simple and promising method to develop an electric-field-induced focus-tunable lens system. In this study, we investigate the relationship between the electrode design of nano-colloidal lenses and their performance, particularly in terms of focal length tunability and image distortion. In previous designs, increasing the lens size led to significant image distortion. To address this issue, we introduced a dual-hole electrode design and optimized the electrode size. This modification resulted in a wider focal length tunability and minimized image distortion, even in larger lenses. Additionally, we experimentally measured the refractive index variation and approximated the nanoparticle distribution to further optimize the lens’s focal length and image distortion. Consequently, this study provides a comprehensive model for designing nano-colloidal lenses and electrodes, paving the way for their use in various applications.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 103004"},"PeriodicalIF":3.7,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}