{"title":"A multimodal deep learning framework for automated major adverse cardiovascular events prediction in patients with end-stage renal disease integrating clinical and cardiac MRI data","authors":"Xinyue Sun , Siyu Guan , Lianming Wu , Tianyi Zhang , Liang Ying","doi":"10.1016/j.displa.2025.102998","DOIUrl":"10.1016/j.displa.2025.102998","url":null,"abstract":"<div><div>Making the accurate prediction of Major Adverse Cardiovascular Events (MACE) in End-Stage Renal Disease (ESRD) patients is crucial for early intervention and clinical management. Traditional methods for MACE prediction are limited by measurement precision and annotation difficulties, resulting in inherent uncertainties in prediction accuracy. To address these challenges, this paper proposes an automatic multimodal deep learning framework that integrates clinical data, patient history, and cardiac magnetic resonance imaging (MRI) data to precisely predict the probability of MACE in ESRD patient occurrence. The system employs automatic seed generation and region localization on 2D slices, followed by 3D convolutional neural networks (CNNs) to extract both local and global features. Additionally, it incorporates clinical test indicators and medical history data, optimizing the weight distribution among features through a gating mechanism. This approach significantly enhances the accuracy and efficiency of MACE in ESRD patient prediction, demonstrating excellent performance on the dataset composed of 176 cardiovascular cases, with an average accuracy of 0.82 in five-fold cross-validation. It is capable of processing large-scale data without requiring physician involvement in labeling, offering substantial potential for clinical application.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 102998"},"PeriodicalIF":3.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143464264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Penta-channel waveguide-based near-eye display with two-dimensional pupil expansion","authors":"Chao Ping Chen, Xiaojun Wu, Jinfeng Wang, Baoen Han, Yunfan Yang, Shuxin Liu","doi":"10.1016/j.displa.2025.102999","DOIUrl":"10.1016/j.displa.2025.102999","url":null,"abstract":"<div><div>We present a penta-channel waveguide-based near-eye display as an ultra-wide-angle architecture for the metaverse. The core concept is to divide one field of view into five by placing the couplers within the regions, where only the subsets of field of view are located. Compared to its counterparts, including the single, double, triple and quad channels, our penta-channel waveguide can push the envelope of field of view further. With the aid of <em>k</em>-space diagram, the upper limit of field of view is illustrated and deduced. The design rules of the waveguide, 4-level grating as the in-coupler, and two-dimensional binary grating as the out-coupler are expounded. Through the rigorous coupled-wave analysis, the efficiencies of gratings can be calculated and optimized. As an overall evaluation, its key performance indicators are summarized as follows. Field of view is 109° (diagonal), eye relief is 10 mm, exit pupil is 6.2 × 6.2 mm<sup>2</sup>, and pupil uniformity is 54 %.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 102999"},"PeriodicalIF":3.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143454688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ASR-NeSurf: Alleviating structural redundancy in neural surface reconstruction for deformable endoscopic tissues by validity probability","authors":"Qian Zhang , Jianping Lv , Jia Gu , Yingtian Li , Wenjian Qin","doi":"10.1016/j.displa.2025.103000","DOIUrl":"10.1016/j.displa.2025.103000","url":null,"abstract":"<div><div>Accurate reconstruction of dynamic mesh models of human deformable soft tissues in surgical scenarios is critical for a variety of clinical applications. However, due to the challenges of limited sparse views, weak image texture information, uneven illumination intensity and large lens distortion in endoscopic video, the traditional 3D reconstruction methods based on depth estimation and SLAM fail to accurate surface reconstruction. Existing neural radiance field methods, such as Endosurf, have been developed for this problem, while these methods still suffer from inaccurate generation of mesh models with structural redundancy due to limited sparse views. In this paper, we propose a novel neural surface reconstruction method for deformable soft tissues from endoscopic videos, named ASR-NeSurf. Specifically, our approach modifies the volume rendering process by introducing the neural validity probability field to predict the probability of redundant structures. Further, unbiased validity probability volume rendering is employed to generate high-quality geometry and appearance. Experiments on three public datasets with variation of sparse-view and different degrees of deformation demonstrate that ASR-NeSurf significantly outperforms the state-of-the-art neural-field-based method, particularly in reconstructing high-fidelity mesh models.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 103000"},"PeriodicalIF":3.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"2822 PPI active matrix micro-LED display fabricated via Au-Au micro-bump bonding technology","authors":"Tianxi Yang , Jie Sun , Yijian Zhou , Yuchen Lu , Jin Li , Zhonghang Huang , Chang Lin , Qun Yan","doi":"10.1016/j.displa.2025.102997","DOIUrl":"10.1016/j.displa.2025.102997","url":null,"abstract":"<div><div>Currently, due to their cost-effectiveness and excellent physical properties, indium and tin are frequently utilized as bump materials for micro-light emitting diodes (Micro-LEDs) and silicon complementary metal–oxide–semiconductor (CMOS) devices to realize flip-chip bonding technology. However, as micro-LED pixel sizes and spacings decrease, forming indium and tin bumps that meet bonding requirements becomes challenging. These bumps are difficult to form an ideal spherical shape in the reflow process and easy to cause interconnection problems between adjacent pixels, adversely affecting device performance. To address this, we propose a novel Au-Au bump technology for micro-LED flip-chip bonding. This technology aims to effectively avoid interconnection issues while simplifying the micro-LED process flow and reducing production costs. Therefore, this paper designed a micro-LED device with 2822 PPI, 640 × 360 resolution, and 9 μm pixel pitch to verify the feasibility of Au-Au micro-bump bonding. During this process, Au bump with diameter of 3.9 μm and 6.5 μm were fabricated for micro-LED array and CMOS driver chip respectively, followed by integrating them using the flip-chip bonding process. Cross-sectional analysis confirmed the high reliability and stability of the Au-Au connection, enabling the micro-LED device to function properly. Furthermore, the Au bump micro-LED exhibits greater electroluminescence (EL) intensity and brightness than the In bump micro-LED, potentially due to the optical losses incurred during the preparation of indium bumps within the micro-LED chip.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"87 ","pages":"Article 102997"},"PeriodicalIF":3.7,"publicationDate":"2025-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143436887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PCRNet: Parent–Child Relation Network for automatic polyp segmentation","authors":"Zaka-Ud-Din Muhammad , Zhangjin Huang , Naijie Gu","doi":"10.1016/j.displa.2025.102993","DOIUrl":"10.1016/j.displa.2025.102993","url":null,"abstract":"<div><div>Colorectal cancer (CRC) is the third most common cancer worldwide in terms of both incidence and mortality rates. On the other hand, its slow development process is very beneficial for early diagnosis and effective treatment strategies in reducing mortality rates. Colonoscopy is considered the standard approach for early diagnosis and treatment of the disease. However, detecting early-stage polyps remains challenging with the current standard colonoscopy approach due to the diverse shapes, sizes, and camouflage properties of polyps.</div><div>To address the issues posed by the different shapes, sizes, colors, and hazy boundaries of polyps, we propose the Parent–Child Relation Encoder Network (PCRNet), a lightweight model for automatic polyp segmentation. PCRNet comprises a parent–child encoder branch and a decoder branch equipped with a set of Boundary-aware Foreground Extraction Blocks (BFEB). The child encoder is designed to enhance feature representation while considering model size and computational complexity. The BFEB is introduced to accurately segment polyps of varying shapes and sizes by effectively handling the issue of hazy boundaries.</div><div>PCRNet is evaluated both quantitatively and qualitatively on five public datasets, demonstrating its effectiveness compared to more than a dozen state-of-the-art techniques. Our model is the most lightweight among current approaches, with only (5.0087) million parameters, and achieves the best Dice Score of (0.729%) on the most challenging dataset, ETIS. PCRNet also has an average inference rate of (36.5) fps on an <span><math><mrow><mi>I</mi><mi>n</mi><mi>t</mi><mi>e</mi><mi>l</mi></mrow></math></span>® <span><math><mrow><mi>C</mi><mi>o</mi><mi>r</mi><msup><mrow><mi>e</mi></mrow><mrow><mi>T</mi><mi>M</mi></mrow></msup></mrow></math></span> i7-10700K CPU with 62 GB of memory, using a GeForce RTX 3080 (10 GB).</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"88 ","pages":"Article 102993"},"PeriodicalIF":3.7,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143445832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ghost-free high dynamic range imaging with shift convolution and streamlined channel transformer","authors":"Zhihua Shen , Fei Li , Yiqiang Wu , Xiaomao Li","doi":"10.1016/j.displa.2025.102983","DOIUrl":"10.1016/j.displa.2025.102983","url":null,"abstract":"<div><div>High dynamic range (HDR) imaging merges multiple low dynamic range (LDR) images to generate an image with a wider dynamic range and more authentic details. However, existing HDR algorithms often produce residual ghosts due to challenges in capturing long-range dependencies in scenes with large motion and severe saturation. To address these issues, we propose an HDR deghosting method with shift convolution and a streamlined channel Transformer (SCHDRNet). Specifically, to better aggregate information across frames, we propose a pixel-shift alignment module (PSAM) to enhance the interaction of adjacent pixel features through shift convolution, improving the accuracy of the attention alignment module (AAM). Additionally, we propose a hierarchical streamlined channel Transformer (SCT) that integrates streamlined channel attention, multi-head self-attention, and channel attention blocks. This architecture effectively captures both global and local context, reducing ghosting from large motions and blurring from small movements. Extensive experiments demonstrate that our method minimizes ghosting artifacts and excels in quantitative and qualitative aspects.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"87 ","pages":"Article 102983"},"PeriodicalIF":3.7,"publicationDate":"2025-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143421119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient detector for maritime search and rescue object based on unmanned aerial vehicle images","authors":"Wanxuan Geng , Junfan Yi , Liang Cheng","doi":"10.1016/j.displa.2025.102994","DOIUrl":"10.1016/j.displa.2025.102994","url":null,"abstract":"<div><div>Unmanned aerial vehicle (UAV) remote sensing has the advantages of responsive and high image resolution, which can better serve the object detection of maritime search and rescue (SAR). However, there are still some obstacles in maritime SAR object detection based on UAV images, due to the lack of samples for training and the complexity background of the maritime images. In this study, we build a maritime search and rescue target dataset (MSRTD) based on UAV images and further propose an efficient multi-category detector named Maritime Search and Rescue-You Only Look Once network (MSR-YOLO). To eliminate the influence of objects scale and shooting angle, we introduce the deformable convolution network (DCN) to modules in backbone. The Coordinated Attention (CA) is added to the neck of network to extract the powerful features. We replace the original detection head with decoupled detection head to better complete the task of object recognition and localization. Finally, we use Wise-Intersection over Union loss (WIoU) during the training to reduce the influence of the samples quality and help model converges rapidly. The experiments on MSRTD confirm that the proposed MSR-YOLO achieves precision, recall, and mean average precision (mAP) (0.5) of 90.00%, 68.52%, and 79.98% respectively. Compared with other methods on public dataset, ours also performs well and provides an effective detector model for maritime SAR object detection based on UAV images.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"87 ","pages":"Article 102994"},"PeriodicalIF":3.7,"publicationDate":"2025-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143421120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HyperTuneFaaS: A serverless framework for hyperparameter tuning in image processing models","authors":"Jiantao Zhang , Bojun Ren , Yicheng Fu , Rongbo Ma , Zinuo Cai , Weishan Zhang , Ruhui Ma , Jinshan Sun","doi":"10.1016/j.displa.2025.102990","DOIUrl":"10.1016/j.displa.2025.102990","url":null,"abstract":"<div><div>Deep learning has achieved remarkable success across various fields, especially in image processing tasks like denoising, sharpening, and contrast enhancement. However, the performance of these models heavily relies on the careful selection of hyperparameters, which can be a computationally intensive and time-consuming task. Cloud-based hyperparameter search methods have gained popularity due to their ability to address the inefficiencies of single-machine training and the underutilization of computing resources. Nevertheless, these methods still encounters substantial challenges, including high computational demands, parallelism requirements, and prolonged search time.</div><div>In this study, we propose <span>HyperTuneFaaS</span>, a Function as a Service (FaaS)-based hyperparameter search framework that leverages distributed computing and asynchronous processing to tackle the issues encountered in hyperparameter search. By fully exploiting the parallelism offered by serverless computing, <span>HyperTuneFaaS</span> minimizes the overhead typically associated with model training on serverless platforms. Additionally, we enhance the traditional genetic algorithm, a powerful metaheuristic method, to improve its efficiency and integrate it with the framework to enhance the efficiency of hyperparameter tuning. Experimental results demonstrate significant improvements in efficiency and cost savings with the combination of the FaaS-based hyperparameter tuning framework and the optimized genetic algorithm, making <span>HyperTuneFaaS</span> a powerful tool for optimizing image processing models and achieving superior image quality.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"87 ","pages":"Article 102990"},"PeriodicalIF":3.7,"publicationDate":"2025-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143395684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SAMR: Symmetric masked multimodal modeling for general multi-modal 3D motion retrieval","authors":"Yunhao Li , Sijing Wu , Yucheng Zhu , Wei Sun , Zhichao Zhang , Song Song , Guangtao Zhai","doi":"10.1016/j.displa.2025.102987","DOIUrl":"10.1016/j.displa.2025.102987","url":null,"abstract":"<div><div>Recently, text to 3d human motion retrieval has been a hot topic in computer vision. However, current existing methods utilize contrastive learning and motion reconstruction as the main proxy task. Although these methods achieve great performance, such simple strategies may cause the network to lose temporal motion information and distort the text feature, which may injury motion retrieval results. Meanwhile, current motion retrieval methods ignore the post processing for predicted similarity matrices. Considering these two problems, in this work, we present <strong>SAMR</strong>, an encoder–decoder based transformer framework with symmetric masked multi-modal information modeling. Concretely, we remove the KL divergence loss and reconstruct the motion and text inputs jointly. To enhance the robustness of our retrieval model, we also propose a mask modeling strategy. Our SAMR performs joint masking on both image and text inputs, during training, for each modality, we simultaneously reconstruct the original input modality and masked modality to stabilize the training. After training, we also utilize the dual softmax optimization method to improve the final performance. We conduct extensive experiments on both text-to-motion dataset and speech-to-motion dataset. The experimental results demonstrate that SAMR achieves the state-of-the-art performance in various cross-modal motion retrieval tasks including speech to motion and text to motion, showing great potential to serve as a general foundation motion retrieval framework.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"87 ","pages":"Article 102987"},"PeriodicalIF":3.7,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143421118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Refactored Maskformer: Refactor localization and classification for improved universal image segmentation","authors":"Xingliang Zhu , Xiaoyu Dong , Weiwei Yu , Huawei Liang , Bin Kong","doi":"10.1016/j.displa.2025.102981","DOIUrl":"10.1016/j.displa.2025.102981","url":null,"abstract":"<div><div>The introduction of DEtection TRansformers (DETR) has marked a new era for universal image segmentation in computer vision. However, methods that use shared queries and attention layers for simultaneous localization and classification often encounter inter-task optimization conflicts. In this paper, we propose a novel architecture called <strong>Refactored Maskformer</strong>, which builds upon the Mask2Former through two key modifications: Decoupler and Reconciler. The Decoupler separates decoding pathways for localization and classification, enabling task-specific query and attention layer learning. Additionally, it employs a unified masked attention to confine the regions of interest for both tasks within the same object, along with a query Interactive-Attention layer to enhance task interaction. In the Reconciler module, we mitigate the optimization conflict issue by introducing localization supervised matching cost and task alignment learning loss functions. These functions aim to encourage high localization accuracy samples, while reducing the impact of high classification confidence samples with low localization accuracy on network optimization. Extensive experimental results demonstrate that our Refactored Maskformer achieves performance comparable to existing state-of-the-art models across all unified tasks, surpassing our baseline network, Mask2former, with 1.2% PQ on COCO, 6.8% AP on ADE20k, and 1.1% mIoU on Cityscapes. The code is available at <span><span>https://github.com/leonzx7/Refactored-Maskformer</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"87 ","pages":"Article 102981"},"PeriodicalIF":3.7,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143376929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}