Displays: Latest Articles, Page 7

WaveletMamba: Wavelet-based state space model for low light image enhancement

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-10, DOI: 10.1016/j.displa.2025.103100

Kai Shang, Mingwen Shao, Chao Wang, Xiaodong Tan

Abstract: Images captured in low-light environments often suffer from poor visibility and substantial noise, which not only affects human perception but also hinders high-level visual tasks. Notable advancements have been made for Low Light Image Enhancement (LLIE) with deep learning methods based on convolutional neural networks, transformers, and diffusion models. Recently, State Space Models (SSMs) have demonstrated impressive performance and gained increasing attention. However, existing LLIE methods predominantly extract features in the spatial domain, overlooking the frequency characteristics. Moreover, although current SSMs enable efficient long-sequence processing, the vanilla scanning mechanism limits their ability to model global context effectively. To address these challenges, we propose a wavelet-based state space model named WaveletMamba for the LLIE task. Specifically, we utilize wavelet transformation to decompose the image into low-frequency components related to illumination and high-frequency components associated with details, manipulating these components in parallel. This approach leverages the frequency-domain property of the image to achieve more exclusive brightness enhancement without compromising high-frequency details. Furthermore, we propose an efficient Mamba block, introducing a position-aware module to mitigate the structural losses caused by the scanning mechanism. By incorporating spatial information, the relationships between pixels are highlighted, thus improving the representational capacity of the network. Extensive experiments demonstrate that our WaveletMamba compares favorably with state-of-the-art approaches, with better efficiency and excellent performance.

Displays, vol. 90, Article 103100.

Citations: 0
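The frequency split described in the WaveletMamba abstract can be illustrated with a one-level 2D Haar transform. This is only a minimal sketch under stated assumptions: the abstract does not say which wavelet is used, and the paper processes the sub-bands with learned network blocks, whereas the hypothetical `brighten` helper below simply multiplies the low-frequency band by a fixed gain. All function names are illustrative and not taken from the authors' code.

```python
def haar_dwt2(img):
    """One-level 2D Haar transform of an even-sized grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r, s = i // 2, j // 2
            ll[r][s] = (a + b + c + d) / 2  # scaled block average: coarse illumination
            lh[r][s] = (a - b + c - d) / 2  # left column minus right column
            hl[r][s] = (a + b - c - d) / 2  # top row minus bottom row
            hh[r][s] = (a - b - c + d) / 2  # diagonal detail
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the image from the four sub-bands."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for s in range(w):
            p, q, u, v = ll[r][s], lh[r][s], hl[r][s], hh[r][s]
            img[2 * r][2 * s] = (p + q + u + v) / 2
            img[2 * r][2 * s + 1] = (p - q + u - v) / 2
            img[2 * r + 1][2 * s] = (p + q - u - v) / 2
            img[2 * r + 1][2 * s + 1] = (p - q - u + v) / 2
    return img


def brighten(img, gain):
    """Hypothetical stand-in for enhancement: scale only the low-frequency band."""
    ll, lh, hl, hh = haar_dwt2(img)
    ll = [[gain * v for v in row] for row in ll]
    return haar_idwt2(ll, lh, hl, hh)


dark = [[1, 2], [3, 4]]
print(brighten(dark, 2.0))  # every pixel is raised by the block mean; differences are kept
```

Because only the LL band is scaled, the differences between neighbouring pixels (the detail bands) pass through unchanged, which is the property the abstract attributes to operating in the wavelet domain.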

MedScreenDental: Automated structured dental record generation via multimodal language model integration

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-10, DOI: 10.1016/j.displa.2025.103119

Wenzhong Jin, Yilan Sun, Kaiyuan Ji, Xiaoyan Jiang, Yufeng Hu, Jinwu Wang, Jiannan Liu

Citations: 0

Lightweight semantic visual mapping and localization based on ground traffic signs

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-09, DOI: 10.1016/j.displa.2025.103096

Jian Lian, Shi Chen, Ge Guo, Duo Sui, Jian Zhao, Linhui Li

Abstract: Traditional high-definition semantic map construction methods rely on mapping vehicles equipped with high-precision sensors, such as LiDAR (Light Detection and Ranging), and cannot adapt to large-scale road scenarios with drastic changes. To address these issues, we propose a lightweight high-definition mapping and localization method that uses low-cost sensors and can automatically construct marker-level semantic maps and accurately estimate lane-level localization information with a purely visual approach. First, we introduce an interactive semi-automated annotation method based on a large vision model and construct a semantic segmentation dataset targeting traffic signs with long-term stability. Then, by transforming traffic signs into BEV (Bird's Eye View) features and a vectorized format, we construct the global semantic map and achieve highly stable lane-level localization by matching static environmental features. Finally, we validate the feasibility and effectiveness of the method on complex scenarios with public datasets and conduct comparative evaluations with other methods. The experimental results show that our method, using low-cost sensors, performs well in terms of accuracy and robustness, and that it significantly reduces the hardware requirements and application difficulties of high-definition mapping and lane-level localization in autonomous driving.

Displays, vol. 90, Article 103096.

Citations: 0

A deep learning approach for music visualization: From audio features to descriptive video generation

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-09, DOI: 10.1016/j.displa.2025.103103

Fan Huang, Zhixin Xu, Xiongkuo Min, Song Song

Abstract: This paper proposes a deep learning-based audio visualization method designed to generate video content synchronized with the audio's style and rhythm through comprehensive analysis of multi-modal features including emotional semantics, stylistic patterns, rhythmic structures, and instrumental signatures. Conventional audio visualization approaches primarily generate videos from basic signal features such as spectral frequency and beat tracking, yet fail to interpret high-level auditory semantics including emotional contexts and stylistic complexity, resulting in a mismatch between the visual content and the audio emotion. The innovation of this paper lies in its multi-dimensional audio analysis, which, combined with a Large Language Model, generates precise visual descriptions, followed by the use of a Text-to-Image model to create images that align with the audio's style. The synthesized images are subsequently temporally aligned with the audio stream via a frame interpolation model, ensuring time alignment and dynamic consistency between the audio and video content. Experimental results demonstrate that the proposed method effectively ensures the quality of audio visualization, making the generated videos more closely aligned with the emotional and rhythmic changes in the audio.

Displays, vol. 90, Article 103103.

Citations: 0

How task demands influence driver behaviour in conditionally automated driving: An investigation of situation awareness and takeover performance

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-07, DOI: 10.1016/j.displa.2025.103117

Yihao Si, Wuhong Wang, Mengzhu Guo, Haiqiu Tan, Dongxian Sun, Haodong Zhang

Abstract: In conditionally automated driving, driver capability and task demands are crucial for safe takeover transitions. Identifying factors influencing task demand and driver capability, as well as exploring their combined effects on driver behaviour and perception, is essential for developing models that optimise driver performance. To simulate varying task demands, we adjusted the urgency of the takeover time budget (TOTB) and the complexity of traffic scenarios (i.e., TOR-Lane, the lane where the vehicle was located when the takeover request occurred), while manipulating driver capability by introducing non-driving related tasks (NDRTs). A multilevel modelling approach was employed to analyse how these factors jointly influenced takeover behaviour and situation awareness (SA). Results indicated that TOTB, NDRT, and TOR-Lane influenced takeover timeliness at different time stages: NDRT affected driver reaction time, while TOTB and TOR-Lane impacted information processing time (IPT). A shorter TOTB resulted in reduced IPT and lower minimum time-to-collision [min (TTC)], especially when visual-cognitive NDRTs were involved, which further impaired takeover quality. Moreover, increased traffic environment complexity prolonged IPT and reduced min (TTC). To meet task demands, drivers adjusted their visual behaviour to rapidly restore SA by reducing the quality of visual processing for low-priority elements, thereby prioritising resources for takeover tasks. Participants' SA improved as TOTB increased, reaching saturation levels that varied with scenario complexity: 7–9 s in the centre lane and 5–7 s in the side lane. This study reveals how driver behavioural patterns are influenced by task demands and their own capabilities, supporting the design of adaptable human–machine interaction models.

Displays, vol. 90, Article 103117.

Citations: 0
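The takeover study above reports minimum time-to-collision, min (TTC), as a takeover-quality measure. The abstract does not give the exact formula, so the sketch below assumes the common constant-speed definition (gap divided by closing speed, infinite when the gap is not closing). The function names and the sample values are hypothetical illustrations, not data from the paper.

```python
import math


def time_to_collision(gap_m, ego_speed_mps, lead_speed_mps):
    """Constant-speed TTC in seconds; infinite if the ego vehicle is not closing in."""
    closing_speed = ego_speed_mps - lead_speed_mps
    return gap_m / closing_speed if closing_speed > 0 else math.inf


def min_ttc(samples):
    """Smallest TTC over a sequence of (gap, ego speed, lead speed) samples."""
    return min(time_to_collision(*sample) for sample in samples)


# Made-up samples logged during one takeover: (gap in m, ego m/s, lead m/s)
takeover = [(50, 25, 15), (30, 25, 15), (20, 20, 15), (25, 15, 15)]
print(min_ttc(takeover))
```

A lower minimum indicates that the driver came closer to a collision during the takeover, which is why the abstract treats a reduced min (TTC) as impaired takeover quality.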

Max-Game: A game-theoretic approach to parameter selection for non-iterative adversarial image steganography

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-07, DOI: 10.1016/j.displa.2025.103093

Junfeng Zhao, Shen Wang, Fanghui Sun

Abstract: Although existing incremental perturbation-based adversarial steganographic schemes have achieved great success and their parameters can be determined with only a small number of experiments, their reliance on uniform parameter settings for all input images may not be optimal for individual cases. Unlike existing works, this paper aims to show that selecting parameters for each image can enhance resistance against re-trained steganalyzers while improving statistical undetectability. Motivated by this, we model the existing parameter-tuning process of incremental perturbation-based schemes as a simple game system between Alice (the steganographer) and Eve (the steganalyst), and demonstrate that the essence of this dynamic process is an unstable Nash equilibrium. To apply this system to each image, we propose a novel game-theoretic framework that allows dynamic interaction between Alice and Eve, and our approach enables Alice to adjust the parameters for each image based on Eve's feedback. To verify its effectiveness, we integrate our previous work SAL into this game, and call it SAL-Game-v2. Extensive experimental results illustrate that SAL-Game-v2 helps Alice find more suitable parameters for each image when using SAL, with an average improvement of 1%–6% in the detection error rate (DER) when resisting Eve's re-trained steganalyzers.

Displays, vol. 90, Article 103093.

Citations: 0
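The steganography abstract above reports gains in detection error rate (DER) without defining it. A minimal sketch follows, assuming the convention common in steganalysis of averaging the false-alarm and missed-detection rates at equal priors; this definition is an assumption, not something the abstract confirms, and the function name and labels are hypothetical.

```python
def detection_error_rate(labels, predictions):
    """Average of false-alarm and missed-detection rates (1 = stego, 0 = cover)."""
    cover_preds = [p for y, p in zip(labels, predictions) if y == 0]
    stego_preds = [p for y, p in zip(labels, predictions) if y == 1]
    p_false_alarm = sum(cover_preds) / len(cover_preds)                 # covers flagged as stego
    p_missed = sum(1 - p for p in stego_preds) / len(stego_preds)       # stego images missed
    return (p_false_alarm + p_missed) / 2


# Made-up steganalyzer decisions on four cover and four stego images
labels = [0, 0, 0, 0, 1, 1, 1, 1]
predictions = [0, 1, 0, 0, 1, 0, 0, 1]
print(detection_error_rate(labels, predictions))
```

Under this definition a higher DER is better for Alice: a value of 0.5 would mean Eve's steganalyzer does no better than guessing.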

Underwater vision clearing with latent space alignment for marine scene enhancement

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-05, DOI: 10.1016/j.displa.2025.103094

Yifan Liu, Chuanbo Zhu, Ke Luo, Jincai Chen

Abstract: Color shift and marine snow are two severe types of underwater imaging noise, which can significantly impact subsequent computer vision tasks. However, existing methods cannot perform underwater image enhancement (UIE) and marine snow removal (MSR) simultaneously with a unified network architecture and a single set of pretrained weights. To address this issue, we propose the two-stage joint UIE and MSR network (TJUMnet). With the proposed latent space alignment-based teacher-guided feature optimization (TFO) and hierarchical reconstruction (HR), TJUMnet can effectively integrate and learn feature representations under various underwater noise conditions. In the first stage, TFO utilizes a well-trained teacher model to guide the primary student model through the latent space feature optimization (LSFO) module, reducing differences between feature domains and enhancing robustness against underwater noise. The second stage, HR, aims to extend and generalize the processing capability, enabling the student model to achieve stable reconstruction and enhancement under different underwater noise conditions. Additionally, we constructed a new joint underwater-image enhancement and marine-snow removal dataset (JUEMR), comprising 2700 image sets covering color shift noise and different intensities of marine snow noise. Extensive experiments on multiple benchmark datasets and the JUEMR dataset demonstrate that TJUMnet reaches state-of-the-art levels in terms of quantitative and visual performance. The code is available at https://github.com/awhitewhale/TJUMnet.

Displays, vol. 90, Article 103094.

Citations: 0

A multi-objective route-searching strategy balancing fuel consumption and crash risks

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-04, DOI: 10.1016/j.displa.2025.103088

Tianren Zhang, Yajie Zou, Yangyang Wang, Siyang Zhang, Yue Zhang, Amin Moeinaddini

Abstract: The Route Guidance System (RGS) has become a crucial component of Intelligent Transportation Systems as urban road networks grow increasingly complex. Traditional RGSs typically assess road segments based on distance or travel time to suggest optimal routes, while some advanced RGS methods incorporate safety or fuel consumption considerations into route selection. However, there is currently little research focusing on RGS approaches that jointly optimize both safety and fuel efficiency. This study introduces a novel multi-objective route-searching strategy designed to simultaneously reduce crash risk and fuel consumption by dynamically adjusting objective weights. Fuel consumption on road segments is estimated using the Virginia Tech Comprehensive Power-based Fuel Consumption Model, while traffic safety is evaluated through a Safety Performance Function. The proposed approach is applied to a real-world case study utilizing 3D road network data, traffic flow information, and historical crash data for Los Angeles, California. Results demonstrate that traditional route-searching strategies, including single-objective and generalized cost methods, struggle to accommodate the complexities and fluctuations of real-world traffic networks. In contrast, the proposed multi-objective route-searching scheme improves the overall averages of safety, fuel efficiency, and travel speed in the long run.

Displays, vol. 90, Article 103088.

Citations: 0
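The trade-off between fuel consumption and crash risk in the route-searching abstract above can be illustrated with a Pareto filter over candidate routes. This is a generic multi-objective sketch, not the authors' algorithm: the abstract does not say how candidates are generated or how the objective weights are adjusted, so the `pick` helper, its `risk_weight` parameter, and the route costs are all hypothetical.

```python
def dominates(p, q):
    """True if cost tuple p is no worse than q everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(routes):
    """Keep only routes whose (fuel, risk) cost is not dominated by another route."""
    return {
        name: cost
        for name, cost in routes.items()
        if not any(dominates(other, cost) for other in routes.values())
    }


def pick(front, risk_weight):
    """Choose a front member by a weighted sum of min-max normalised fuel and risk."""
    fuels = [fuel for fuel, _ in front.values()]
    risks = [risk for _, risk in front.values()]
    fuel_span = (max(fuels) - min(fuels)) or 1.0  # avoid division by zero
    risk_span = (max(risks) - min(risks)) or 1.0

    def score(cost):
        fuel, risk = cost
        return ((1 - risk_weight) * (fuel - min(fuels)) / fuel_span
                + risk_weight * (risk - min(risks)) / risk_span)

    return min(front, key=lambda name: score(front[name]))


# Made-up candidates: route -> (fuel in litres, expected crash risk)
routes = {"A": (2.0, 0.30), "B": (2.4, 0.10), "C": (2.6, 0.35), "D": (3.0, 0.05)}
front = pareto_front(routes)
print(sorted(front))     # route C is dominated by route A and drops out
print(pick(front, 0.5))  # a balanced weight favours the compromise route
```

Changing `risk_weight` moves the choice along the front, from the most fuel-efficient route at 0 to the safest at 1.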

FacePaint: Two-way cross context semantic network for face image inpainting

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-04, DOI: 10.1016/j.displa.2025.103092

Yongsheng Shi, Dongjin Huang, Jinhua Liu, Jiantao Qu, Zhifeng Xie, Lizhuang Ma

Abstract: Due to the lack of publicly available paired data for occluded and unoccluded faces, current face image inpainting methods struggle to generate high-quality results for naturally occluded face images. Moreover, for large occluded areas, most methods are prone to reconstructing distorted structures and blurred textures, which can destroy the global semantic content of face images. To address these challenges, we propose a transformer-based inpainting framework (FacePaint) for automatically inpainting face images occluded by different types of objects. First, we simulate human faces that are occluded in the real world and construct a new face dataset. Second, to enhance the understanding of global semantics, we propose a Two-way Cross Context Semantic Attention network (TCCSA) that incorporates self-attention (SA), context-to-semantic (CTS), and semantic-to-context (STC) models. TCCSA can capture the semantic information of faces while extracting long-range contextual features guided by semantic priors. Third, to improve the ability to reconstruct large occluded regions, we propose a novel gated convolution-based feed-forward network (FFN) dedicated to extracting local contextual features of images. Finally, to ensure that FacePaint can focus on the structures and textures of images, as well as the semantic information, a new loss function is proposed to guide its training. Extensive experimental results demonstrate that the proposed FacePaint is significantly superior to state-of-the-art approaches both qualitatively and quantitatively on five synthesized datasets. Additionally, FacePaint can be effectively applied to real scenes, generating high-fidelity results from face images occluded by different objects in the wild.

Displays, vol. 90, Article 103092.

Citations: 0

Research on the design of online museum exhibition system based on user experience evaluation and eye movement experiment analysis

IF 3.7, Zone 2 (Engineering & Technology)

Displays, Pub Date: 2025-06-04, DOI: 10.1016/j.displa.2025.103107

Linhui Hu, Siyin Liao, Lidan Chen, Qian Shan, Tao Liu, Kang Shen

Abstract: As digital technologies continue to evolve, museums are optimizing their websites to enhance the online visitor experience and encourage repeat engagement. This study examines user experiences across three widely adopted systems (Panoramic Roaming, Slide Graphics, and Digital Catalog) using a mixed-methods approach combining questionnaires and eye-tracking experiments. Results reveal a discrepancy between eye-tracking metrics and self-reported perceptions, indicating cognitive dissonance between attention-based engagement and subjective satisfaction. While system characteristics showed significant variations in authenticity and participation, only authenticity and overall experiential quality (pleasantness, education, aesthetics) significantly predicted user attitudes; participation and usability demonstrated no significant effects. These findings highlight the primacy of emotional-cognitive factors in shaping digital exhibition attitudes, suggesting designers should prioritize perceptual authenticity and holistic experience quality over technical features alone.

Displays, vol. 90, Article 103107.

Citations: 0