A mixed-scale dynamic attention transformer for pediatric pneumonia diagnosis
Qian Chen, Lvhai Chen, Wenjie Nie, Xudong Li, Jingyuan Zheng, Jiajun Zhong, Yihua Wei, Yan Zhang, Rongrong Ji
Displays, Volume 87, Article 102953. DOI: 10.1016/j.displa.2024.102953. Published 2025-01-02.
Abstract: Pediatric pneumonia is a leading cause of morbidity and mortality in children under five, emphasizing the urgent need for automated diagnostic systems. While deep learning has shown promise in natural image classification, pediatric pneumonia imaging presents unique challenges, such as subtle symptoms, smaller anatomical structures, and the need for fine-grained feature extraction. To address this, we propose a Mixed-Scale Dynamic Attention Transformer aided by large language models (LLMs), which consists of three key modules: (1) Dynamic Local Attention Module: dynamically focuses on nearby regions with fine-grained attention and applies coarse-grained attention to distant areas, effectively capturing both local and global spatial dependencies. (2) Hierarchical Multi-Scale Unit Module: integrates and enhances multi-scale channel information, adapting to varying spatial scales to better detect subtle pneumonia-related features. (3) Attention Amplification Module: leverages a frozen large language model (e.g., GPT, LLaMA) to amplify attention on critical pneumonia features by exploiting its rich semantic insights and global contextual understanding. Evaluations on pediatric chest X-ray datasets, including Pneumonia Physician, Guangzhou Women and Children's Medical Center, and NIH CXR14, demonstrate the proposed method's superior performance across key metrics such as accuracy, AUC, precision, recall, and F1-score.
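The abstract's Dynamic Local Attention Module applies fine-grained attention to nearby regions and coarse-grained attention to distant ones. The paper's actual architecture is not given here, so the following is only a minimal 1-D sketch of that fine/coarse split; the function name, radius, and pooling size are all hypothetical.

```python
import math

def dynamic_local_attention(query_idx, tokens, radius=2, pool=2):
    """Toy 1-D sketch: attend to tokens near query_idx individually
    (fine-grained) and to pooled averages of distant tokens (coarse)."""
    fine = [tokens[i] for i in range(len(tokens))
            if abs(i - query_idx) <= radius]
    distant = [tokens[i] for i in range(len(tokens))
               if abs(i - query_idx) > radius]
    # Coarse: average distant tokens in groups of `pool`.
    coarse = [sum(distant[i:i + pool]) / len(distant[i:i + pool])
              for i in range(0, len(distant), pool)]
    keys = fine + coarse
    q = tokens[query_idx]
    scores = [q * k for k in keys]                   # dot-product scores
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]      # stable softmax
    z = sum(weights)
    weights = [w / z for w in weights]
    return sum(w * k for w, k in zip(weights, keys))  # attention output
```

The key count shrinks from len(tokens) toward radius + len(distant)/pool, which is the cost saving the mixed-scale design is after.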
Uniform-reference threshold-dynamic skipping for video compressive sensing
Hao Liu, Renhui Sun
Displays, Volume 87, Article 102963. DOI: 10.1016/j.displa.2024.102963. Published 2024-12-31.
Abstract: Block-based Compressive Sensing (BCS) compresses the original signal during the sampling process, thus reducing the computational burden at the encoder. BCS is suitable for scenarios where encoding-end resources are limited, such as drone photography. For video signals, there is high similarity between adjacent frames, so some researchers have proposed performing block skipping at the encoder under the GOP-BCS framework to further compress the data that must be transmitted to the decoder. The choice of reference and the selection of skip-blocks affect both the reconstruction quality at the decoder and the compression ratio at the encoder. This paper proposes a Uniform-reference Threshold-dynamic Skipping (UTS) algorithm. First, the proposed algorithm sets a dynamic threshold to select skip-blocks, which suits video sequences with different degrees of motion. Second, in a general GOP framework, the keyframes and the prime non-keyframe in the middle are used as reference frames, so that the reference frames are uniformly distributed and can provide accurate skip-block references for more non-keyframes. At the same time, a high threshold is set for the prime non-keyframe when selecting skip-blocks, to ensure its reliability as a reference frame and further improve the skipping ratio. The experimental results show that, compared with state-of-the-art algorithms, the proposed algorithm achieves a higher skipping ratio and effectively reduces the energy consumption of signal transmission when the same reconstruction quality is required at the decoder.
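The core idea above, skip a block when it differs little from its reference and adapt the threshold to the sequence's motion, can be sketched in a few lines. The difference metric (mean absolute difference), the base threshold, and the adaptation rule below are all hypothetical stand-ins; the paper's actual UTS formulation is not given in the abstract.

```python
def mean_abs_diff(block, ref):
    """Mean absolute difference between a block and its reference block."""
    return sum(abs(a - b) for a, b in zip(block, ref)) / len(block)

def select_skip_blocks(frame_blocks, ref_blocks, base_thresh=4.0):
    """Sketch of dynamic-threshold skip-block selection: keep the
    threshold high for low-motion frames (more skipping), lower it
    as frame-wide motion grows (fewer, safer skips)."""
    diffs = [mean_abs_diff(b, r) for b, r in zip(frame_blocks, ref_blocks)]
    motion = sum(diffs) / len(diffs)                     # crude motion estimate
    thresh = base_thresh / (1.0 + motion / base_thresh)  # dynamic threshold
    skip = [d <= thresh for d in diffs]
    return skip, thresh
```

A static frame skips every block; a frame with one moving block keeps that block while still skipping the static ones.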
Crypto-space steganography for 3D mesh models with greedy selection and shortest expansion
Kai Gao, Ji-Hwei Horng, Ching-Chun Chang, Chin-Chen Chang
Displays, Volume 87, Article 102961. DOI: 10.1016/j.displa.2024.102961. Published 2024-12-31.
Abstract: Data hiding in encrypted 3D mesh models has emerged as a promising crypto-space steganography technique. However, existing methods leave room to improve embedding capacity because they underutilize the model's topological features. In this paper, we propose an innovative greedy selection and shortest expansion strategy to select a proper reference set of vertices. Subsequently, multi-MSB prediction and entropy coding are leveraged to further reduce the redundancy in the vertex coordinates for data embedding. By combining the new strategy with efficient compression of the embeddable vertices, we raise the vertex utilization rate to approximately 90%. Experimental results show that our proposed scheme outperforms state-of-the-art methods, offering a substantial improvement in data payload for reversible data hiding in encrypted 3D mesh models.
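Multi-MSB prediction, mentioned above, exploits the fact that a vertex coordinate's most significant bits usually match a prediction from its reference vertices, so those bits are redundant and can carry payload. A minimal integer sketch of that idea follows; the function names, 16-bit width, and embedding rule are illustrative, not the paper's scheme.

```python
def predict_msb_run(value, prediction, bits=16):
    """Count how many leading bits of `value` the `prediction` gets
    right; those MSBs are redundant and can be replaced with payload."""
    v = format(value, f"0{bits}b")
    p = format(prediction, f"0{bits}b")
    run = 0
    for a, b in zip(v, p):
        if a != b:
            break
        run += 1
    return run

def embed_in_msbs(value, run, payload_bits, bits=16):
    """Overwrite the first len(payload_bits) bits of `value` with payload
    (only valid while len(payload_bits) <= run, so the decoder can
    recover the original MSBs from the same prediction)."""
    assert len(payload_bits) <= run
    v = list(format(value, f"0{bits}b"))
    v[:len(payload_bits)] = payload_bits
    return int("".join(v), 2)
```

The longer the matching run, the more payload bits each coordinate carries, which is why better reference selection translates directly into capacity.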
Single image super-resolution with channel attention and diffusion
Sen Xiang, Dasheng Huang, Haibing Yin, Hongkui Wang, Li Yu
Displays, Volume 87, Article 102942. DOI: 10.1016/j.displa.2024.102942. Published 2024-12-31.
Abstract: Single image super-resolution (SISR) is a fundamental vision task that facilitates a range of applications. In SISR, human perception, objective quality, and complexity are the three main concerns. In this paper, we propose a new SISR framework, single image super-resolution with channel attention and diffusion (SRCAD). SRCAD combines the stochastic iterative mechanism of the denoising diffusion probabilistic model (DDPM) with advanced feature encoding techniques. Guided by encoded features, the diffusion model better predicts SR images with more detail, and the model size is reduced as well. Specifically, in feature encoding, SRCAD introduces a pre-trained encoder combined with dimensional interleaved product channel attention (DIP-CA), which extracts key features at low computational cost. The extracted deep features guide iterative denoising and are combined with distribution-aware feature subset fusion (DFSF) to reduce the data dimension of the features. Experimental results demonstrate that SRCAD performs well on four datasets and two SR tasks. It outperforms other state-of-the-art models on objective quality metrics, including PSNR, SSIM, and LR-PSNR, and it reduces the number of parameters, delivering high performance with low complexity.
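The abstract names DIP-CA but does not define it, so as background only, here is a generic channel-attention sketch in the squeeze-and-excitation style: pool each channel to a scalar, gate it through a sigmoid, and rescale the channel. This is a hypothetical stand-in, not the paper's DIP-CA.

```python
import math

def channel_attention(feature_maps):
    """Generic channel attention: global-average-pool each channel,
    squash with a sigmoid, and rescale the channel by the weight."""
    weights = []
    for ch in feature_maps:                      # ch: list of activations
        pooled = sum(ch) / len(ch)               # global average pooling
        weights.append(1.0 / (1.0 + math.exp(-pooled)))  # sigmoid gate
    return [[w * x for x in ch] for w, ch in zip(weights, feature_maps)]
```

Channels with strong average activation are passed through nearly unchanged, while weak channels are attenuated, which is the basic reweighting any channel-attention variant builds on.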
Interpersonal synchronization and eye-tracking in children with autism spectrum disorder: A systematic review
Bei-hua Zhang, Qian-ru Jiang, Chao-shan Yan, Ran Tao
Displays, Volume 87, Article 102950. DOI: 10.1016/j.displa.2024.102950. Published 2024-12-31. Open access.
Background: We aimed to synthesize recent literature on interpersonal synchronization (IPS) and eye tracking (ET) in children and adolescents with autism spectrum disorder (ASD) during face-to-face real-time social IPS tasks, to examine how atypical their performance is compared with typically developing (TD) children, and to analyze factors influencing eye movements, ET performance, and patterns of cortical activation in children with ASD.
Method: The Web of Science, PubMed, ScienceDirect, Embase, and Cochrane Library databases were searched. The review is registered with PROSPERO (CRD42021245383).
Results: 11 of the 532 retrieved papers were included: 7 studies applied eye tracking and 4 applied functional near-infrared spectroscopy (fNIRS). In the 7 eye-tracking studies, children with ASD were often characterized by atypical gaze patterns, namely a marked decrease in focus on adults' faces and an increase in focus on backgrounds. In the 4 fNIRS studies, children with ASD showed poor movement synchronization; hypoactivity of the superior temporal sulcus (STS), inferior frontal gyrus (IFG), middle and inferior frontal gyri (MIFG), and middle and superior temporal gyri (MSTG); and hyperactivity of the inferior parietal lobe (IPL). Activation of the STS, IFG, MIFG, and MSTG was associated with social performance, communication performance, and ASD severity.
Conclusion: Recognizing the early atypical development of eye movements and of function in the STS, IFG, MIFG, MSTG, and IPL in children with ASD is important for how clinicians, therapists, and social workers treat and intervene with this population, although it is difficult for now to draw conclusions with a high level of evidence, given the heterogeneity within ASD, inconsistencies in study design, and the lack of multicenter randomized controlled studies. Preliminary investigations of behavioral and neurobiological markers in children with ASD during real-time face-to-face social IPS tasks are underway. In addition, we analyze the relevant influencing factors, which can help guide the design of more targeted experimental paradigms and intervention practice.
Efficient p-i-n-p blue OLED devices by inserting organic heterojunctions for charge generation
Bingjia Zhao, Yangyang Zhu, Yacheng Xu, Xuande Yang, Jie Li, Yanqiong Zheng, Bin Wei, Wei Shi, Chenchen Li, Siqi Zhang
Displays, Volume 87, Article 102959. DOI: 10.1016/j.displa.2024.102959. Published 2024-12-30.
Abstract: An n-p type organic heterojunction composed of various materials was inserted between the Al electrode and the electron injection layer to fabricate an optimized p-i-n-p blue OLED device. The n-p type organic heterojunction generates carriers under reverse bias, reducing the carrier injection barrier and regulating carrier balance. A p-n junction carrier device was fabricated to investigate its carrier transport properties. The energy levels of the p-layer and n-layer were analyzed using ultraviolet photoelectron spectroscopy (UPS) and low-energy inverse photoelectron spectroscopy (LEIPS). The p-i-n-p OLED with BPhen:15% Ag2O (20 nm)/NPB:10% MoO3 (10 nm) as the heterojunction shows a remarkable 0.6 V reduction in turn-on voltage. Current efficiency (CE), power efficiency (PE), and external quantum efficiency (EQE) increase by 29.3%, 41.2%, and 12.5%, respectively, compared with the basic p-i-n OLED. Finally, FTIR spectra indicate charge generation in the p-n junction, enabling a balanced carrier distribution.
Multi-scale and contrastive learning for pediatric chest radiograph classification tasks
Qian Chen, Zihang Lin, Xudong Li, Jingyuan Zheng, Yan Zhang, Rongrong Ji
Displays, Volume 87, Article 102951. DOI: 10.1016/j.displa.2024.102951. Published 2024-12-30.
Abstract: Pediatric medical image classification faces enormous challenges due to the subtlety of children's physiology, the subtle manifestations of pathological changes, and the urgent need for accurate and timely diagnosis. This complexity is further exacerbated by high variability in image quality, small sample sizes for rare diseases, and the need for models to generalize well over diverse and often limited datasets. Addressing these challenges is imperative to improve pediatric healthcare outcomes. To this end, this paper proposes a model that combines contrastive learning with multi-scale theory, simulating how a physician's eye zooms in and out when examining a medical image. First, we zoom the image in and out, then perform feature extraction and blending with a feature encoder and a scale-integration unit to learn both the fine texture and the global features of the lesion. At the same time, we write a series of texts for each disease category to be diagnosed and obtain their features through a text encoder. To further fuse the image features, we also introduce a frozen LLM block. Finally, we compute the similarity between text features and image features, the crucial step of contrastive learning, and obtain the final categories. On four public datasets, our proposed model performs excellently and outperforms existing SOTA methods. In addition, our model also generalizes well, particularly in IQA. With this work, we aim to open new avenues for the use of contrastive learning and multi-scale theory in pediatric medical imaging and to enrich the understanding of their potential in this specialized field.
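The final step described above, computing image-text similarity and taking the best-matching category, is the standard CLIP-style zero-shot classification pattern. A minimal sketch with toy 2-D embeddings follows; the feature vectors and labels are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def classify(image_feat, text_feats, labels):
    """Pick the class whose text embedding is most similar to the image
    embedding, the contrastive-matching step the abstract describes."""
    sims = [cosine(image_feat, t) for t in text_feats]
    return labels[sims.index(max(sims))]
```

In the real model both encoders are trained (contrastively) so that matching image-text pairs score high; here the geometry is simply hand-picked.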
Simulated bobbing in walking and running in virtual reality
You-Sheng Zhang, Li-Chen Ou
Displays, Volume 87, Article 102943. DOI: 10.1016/j.displa.2024.102943. Published 2024-12-28.
Abstract: In this study, a psychophysical experiment was carried out to develop and evaluate a bobbing mechanism that replicates natural human walking and running movement in virtual reality (VR), aiming to enhance realism and user comfort. Subjects were required to adjust oscillation amplitudes to determine the most comfortable level of bobbing during walking and running in VR while remaining still in the real world. The experimental results indicate a clear preference for vertical oscillation, with minimal variation across subjects at different velocities, emphasizing the value of simulated bobbing for enhancing realism and comfort without inducing discomfort. The average vertical oscillation amplitude was ±0.695 cm for walking and ±1.351 cm for running. Lateral rotation showed more variability, with an average roll amplitude of ±0.186° for walking and ±0.334° for running. These findings suggest that vertical oscillation is critical for providing more immersive and comfortable VR experiences, and that bobbing simulation might be a useful tool for addressing VR cybersickness.
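The reported amplitudes (±0.695 cm walking, ±1.351 cm running) can be dropped into a simple sinusoidal camera offset, which is presumably the shape of such a bobbing mechanism. The sketch below makes that assumption explicit; the step frequency is hypothetical (not reported in the abstract), and only the amplitudes come from the study.

```python
import math

AMPLITUDE_CM = {"walking": 0.695, "running": 1.351}  # averages from the study

def vertical_offset(gait, t, step_hz=2.0):
    """Vertical camera offset in cm at time t (seconds). step_hz is a
    hypothetical step frequency; the study reports only amplitudes."""
    return AMPLITUDE_CM[gait] * math.sin(2 * math.pi * step_hz * t)
```

Each rendered frame would add this offset to the head-mounted camera's y position, peaking at the study's amplitude once per step cycle.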
Review of stereo matching based on deep learning
Shangshang Zhang, Weixing Su, Fang Liu, Lincheng Sun
Displays, Volume 87, Article 102940. DOI: 10.1016/j.displa.2024.102940. Published 2024-12-28.
Abstract: In recent years, the introduction of deep learning has greatly advanced computer vision technology. Stereo matching, a key technique, has shown impressive performance with the help of deep neural networks, attracting a significant amount of research. Stereo matching based on deep learning differs from traditional methods in many ways, so this paper provides a comprehensive review of the latest developments in the field. We take a novel perspective by categorizing these algorithms into four branches based on how they handle the left and right views in stereo matching networks. We outline the evolution and prospects of each branch, summarizing and analyzing its outstanding methods and exploring their advantages, disadvantages, and main challenges. We also identify unresolved issues in each branch and discuss future research directions.
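For readers new to the topic, the classic baseline that the deep methods surveyed above learn to replace is window-based matching: for each left-image pixel, search along the epipolar line in the right image for the shift (disparity) with the lowest patch difference. A minimal 1-D sum-of-absolute-differences sketch:

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length patches."""
    return sum(abs(x - y) for x, y in zip(a, b))

def disparity(left_row, right_row, x, window=1, max_disp=4):
    """Find the disparity d minimizing SAD between the patch around x in
    the left row and the patch shifted left by d in the right row."""
    patch = left_row[x - window:x + window + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(0, max_disp + 1):
        if x - d - window < 0:         # shifted patch would leave the image
            break
        cand = right_row[x - d - window:x - d + window + 1]
        cost = sad(patch, cand)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Deep networks replace the hand-crafted SAD cost and the per-pixel argmin with learned features and learned cost aggregation, which is what the review's four branches differ on.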
A mental fatigue assessment method for pilots incorporating multiple ocular features
Huining Pei, Guiyang Li, Yujie Ma, Hao Gong, Mingzhe Xu, Zhonghang Bai
Displays, Volume 87, Article 102956. DOI: 10.1016/j.displa.2024.102956. Published 2024-12-26.
Abstract: Because pilots' mental fatigue is difficult to assess without contact, this work proposes a mental fatigue assessment model that incorporates the BlazeFace algorithm and a generalized regression neural network (GRNN) improved by a genetic algorithm (GA). First, the pilot's flight video is converted into a dataset of frame sequences, and the images are preprocessed using homomorphic filtering. The BlazeFace face detection algorithm is then used to detect faces frame by frame, localizing the facial feature points. Next, since different features correlate with mental fatigue to varying degrees, PERCLOS, average fixation duration, and pupil area are weighted through expert evaluation and fused into a composite feature M, which serves as the input to the mental fatigue classification model. Finally, the mental fatigue state is determined frame by frame using the GA-optimized GRNN. The number of frames identified as mentally fatigued over a given period is counted, and the pilot's mental fatigue level is output based on the number of frames in a fatigued state. The results show that the proposed BlazeFace-GA-GRNN model assesses pilot mental fatigue with improved accuracy and effectiveness, providing a theoretical reference for improving the safety management of pilot mental fatigue.
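The pipeline's two fusion steps, weighting the three ocular features into the composite feature M and mapping the count of fatigued frames to a fatigue level, can be sketched directly. The weights and level thresholds below are hypothetical; the paper derives its weights from expert evaluation and does not report the numbers in this abstract.

```python
def composite_feature(perclos, fixation, pupil, weights=(0.5, 0.3, 0.2)):
    """Weighted fusion of PERCLOS, average fixation duration, and pupil
    area into composite feature M (weights are illustrative only)."""
    w1, w2, w3 = weights
    return w1 * perclos + w2 * fixation + w3 * pupil

def fatigue_level(frame_flags, thresholds=(0.3, 0.6)):
    """Map the fraction of frames flagged as fatigued in a window to a
    coarse fatigue level (level names and cutoffs are hypothetical)."""
    ratio = sum(frame_flags) / len(frame_flags)
    if ratio < thresholds[0]:
        return "alert"
    if ratio < thresholds[1]:
        return "moderate"
    return "fatigued"
```

In the actual model the per-frame flag comes from the GA-optimized GRNN evaluated on M; here it is just a boolean input.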