Displays, Volume 84, Article 102786 | Pub Date: 2024-06-25 | DOI: 10.1016/j.displa.2024.102786

Multi-scale information transport generative adversarial network for human pose transfer
Jinsong Zhang, Yu-Kun Lai, Jian Ma, Kun Li

Abstract: Human pose transfer, a challenging image generation task, aims to transfer a source image from one pose to another. Existing methods often struggle to preserve details in visible regions or predict reasonable pixels for invisible regions due to inaccurate correspondences. In this paper, we design a novel multi-scale information transport generative adversarial network, composed of Information Transport (IT) blocks that establish and refine the correspondences progressively. Specifically, we compute a transport matrix to warp the source image features by integrating an optimal transport solver in our proposed IT block, and use IT blocks to refine the correspondences at different resolutions to preserve rich details of the source image features. The experimental results and applications demonstrate the effectiveness of our proposed method. We further present an image-specific optimization using only a single image. The code is available for research purposes at https://github.com/Zhangjinso/OT-POSE.
Displays, Volume 84, Article 102781 | Pub Date: 2024-06-25 | DOI: 10.1016/j.displa.2024.102781

Exploration of exocentric perspective interfaces for virtual reality collaborative tasks
Lei Chen, Junkun Long, Rongkai Shi, Ziming Li, Yong Yue, Lingyun Yu, Hai-Ning Liang

Abstract: Exocentric views play a pivotal role in computer-mediated collaboration, especially in Collaborative Virtual Environments (CVEs), where focusing on the actions and operations of collaboration partners is crucial. The exocentric perspective offers users a vantage point to ascertain the whereabouts and actions of their partners, enhancing spatial awareness and social presence in CVEs. Moreover, interacting via an Exocentric Perspective Interface (ExPI) can help users complete searching and manipulation tasks remotely and efficiently. This work investigates the potential benefits of two representative ExPIs, World In Miniature (WIM) and 2D Map, for VR collaboration. We conducted a user study with 36 participants (18 pairs) to compare WIM and the 2D Map against a baseline in a VR collaborative task encompassing a series of searching and manipulation tasks of different complexities (Simple, Medium, and Complex). In the Baseline (BL) condition, participants were not provided with an ExPI but were instead given a map of the virtual environment (VE). The results indicate that the two ExPIs significantly improved task performance, usability, social presence, and user experience while reducing VR sickness. In addition, we found that WIM outperformed the 2D Map, especially in complex collaborative environments. Based on these findings, three design implications are proposed to guide the design of future VR collaboration systems.
Displays, Volume 84, Article 102766 | Pub Date: 2024-06-25 | DOI: 10.1016/j.displa.2024.102766

Diffraction calculations from real-to-complex, complex-to-real, and real-to-real fields
Tomoyoshi Shimobaba, David Blinder, Tatsuki Tahara, Fan Wang, Takashi Nishitsuji, Atsushi Shiraki, Chau-Jern Cheng, Tomoyoshi Ito

Abstract: Conventional diffraction calculations typically employ complex Fourier transforms in which the source and target fields are represented by complex values. However, this approach is inefficient for certain applications. To address this problem, this study introduces diffraction calculations for three combinations of source and target fields: real-to-complex, complex-to-real, and real-to-real. These calculations utilize a real-valued fast Fourier transform and Hermitian symmetry, enabling accelerated computation by eliminating half of the spectra. This study also demonstrates practical applications of these diffraction calculations, including image reproduction in digital holography, speckle reduction in holographic projection, and accelerated hologram computation for holographic displays.
Displays, Volume 84, Article 102785 | Pub Date: 2024-06-24 | DOI: 10.1016/j.displa.2024.102785

Chest CT-IQA: A multi-task model for chest CT image quality assessment and classification
Siyi Xun, Mingfeng Jiang, Pu Huang, Yue Sun, Dengwang Li, Yan Luo, Huifen Zhang, Zhicheng Zhang, Xiaohong Liu, Mingxiang Wu, Tao Tan

Abstract: In recent years, especially during the COVID-19 pandemic, a large number of Computerized Tomography (CT) images have been produced every day for inspecting lung diseases. However, diagnostic accuracy depends on the quality of CT imaging, and low-quality images may greatly affect clinical diagnosis, resulting in misdiagnosis. Effectively rating the quality of massive numbers of CT images is difficult. To solve these problems, we first constructed a dataset of 800 CT volumes for chest CT image quality assessment. We then propose a multi-task model for chest CT image quality assessment and classification. The model automatically classifies CT image sequences acquired with different visual inspection windows and automatically estimates a CT image quality score that matches the visual scores of clinicians. The experimental results show that the window classification accuracy and the dose exposure classification accuracy of our model reach 0.8375 and 0.8813, respectively. The Pearson Linear Correlation Coefficient (PLCC) and Root Mean Square Error (RMSE) between the model predictions and the averaged annotations of two radiologists reach 0.3288 and 1.9264, respectively. These results show that our model has the potential to mimic the quality evaluation of experts.
Displays, Volume 84, Article 102789 | Pub Date: 2024-06-24 | DOI: 10.1016/j.displa.2024.102789

Legibility of variable message signs on foggy highway: Effect of text color and spacing
Fa-ren Huo, Yi-ran Feng, Fei Fang

Abstract: Variable message signs are frequently utilized as highway communication facilities to inform drivers about road and weather conditions. However, standards and design specifications for variable message signs remain unclear. This study examines the effect of sign-intrinsic factors and environmental factors on the legibility of variable message signs in order to optimize their design on highways. Three font colors (green, red, and yellow) and three levels of word spacing (0.1H, 0.3H, and 0.5H) were chosen for the variable message signs. For the environmental factors, three weather types were represented according to visibility: sunny (visibility = 10,000 m), light fog (visibility = 1000 m), and heavy fog (visibility = 200 m). The legibility time and distance were recorded for 23 participants, and the legibility of the variable message signs was then analyzed. The results showed that both sign-intrinsic and environmental factors influenced legibility. Experiment 1 showed that legibility was best when the text was yellow under light fog conditions. Experiment 2 showed that legibility was best with a 0.3H word spacing under light fog conditions. The findings of this study can be used by relevant agencies to establish design guidelines for variable message signs.
Displays, Volume 84, Article 102783 | Pub Date: 2024-06-22 | DOI: 10.1016/j.displa.2024.102783

JND-based multi-module cooperative perceptual optimization for HEVC
Hongkui Wang, Yin Chen, Qi Ye, Zhun Li, Antong Pan, Haibing Yin, Liutao Wang, Jun Yin, Heng Jin, Li Yu, Wenyao Zhu, Xianghong Tang

Abstract: Since the just noticeable distortion (JND) directly reflects the tolerance limit of the human visual system (HVS) to coding distortion, JND-based perceptual video coding (PVC) plays an increasingly significant role in video compression amid the explosive growth of video data. In this paper, we focus on the coupling effect among coding modules and provide a JND-based multi-module cooperative perceptual optimization (JMCPO) scheme for HEVC. The main contributions of the proposed JMCPO scheme include the following three aspects. (1) Based on quantization distortion estimation, an adaptive perceptual quantization scheme is proposed that uses a binary search so that the quantization distortion approaches the estimated JND threshold as closely as possible. (2) The coupling effect among coding modules is analyzed, and a novel perceptual residual filtering scheme is presented based on statistical analysis of the coupling strength. (3) The JMCPO scheme is developed through collaborative optimization of residual filtering, quantization, and rate-distortion optimization. Experimental results show that the proposed JMCPO scheme saves bitrate while achieving better subjective and objective quality.
Displays, Volume 84, Article 102782 | Pub Date: 2024-06-19 | DOI: 10.1016/j.displa.2024.102782

Light field image super-resolution using a Content-Aware Spatial–Angular Interaction network
Wuzhen Shi, Jiajun Su, Yang Wen, Yutao Liu

Abstract: Light field images record a series of viewpoints of a scene and thus have many attractive applications. However, the trade-off between angular and spatial resolution in the imaging process makes light field image super-resolution necessary. In this paper, we propose a Content-Aware Spatial–Angular Interaction (dubbed CASAI) network for light field image super-resolution. The gradient branch of CASAI makes full use of the context information in the low-resolution gradient map and the multi-level features of the super-resolution branch to generate a high-resolution gradient map, which captures structure, texture, and detail and provides an effective prior for the super-resolution process. The super-resolution branch of CASAI generates high-quality super-resolution images by exploiting intra-view (i.e., spatial) and inter-view (i.e., angular) information through a spatial–angular adaptive interaction block, using the high-resolution gradient prior as guidance. The spatial–angular adaptive interaction block learns the relative importance of spatial and angular features, so as to better integrate intra-view and inter-view information and improve super-resolution performance. The experimental results indicate that, at an upsampling factor of 4, our method outperforms our baseline (LF-InterNet) with an average PSNR increase of 0.48 dB, while also achieving the best SSIM. Visualization results demonstrate the advantage of our method in simultaneously generating natural SR images and restoring structures.
Displays, Volume 84, Article 102775 | Pub Date: 2024-06-18 | DOI: 10.1016/j.displa.2024.102775

Outlier detection in temporal and spatial sequences via correlation analysis based on graph neural networks
Yan Gao, Qingquan Lin, Shuang Ye, Yu Cheng, Tao Zhang, Bin Liang, Weining Lu

Abstract: Outlier detection is essential for identifying patterns that deviate from expected normal representations in data. Real-world challenges such as the lack of labeled data, noise, and high dimensionality significantly impact the effectiveness of existing methods. Understanding the temporal or spatial correlation of normal data is crucial for detecting or forecasting anomalous patterns. In this paper, we introduce UOSC-GNN, a novel Unsupervised Outlier detection architecture based on Sequential Correlation analysis with a Graph Neural Network. The architecture includes a deviation generation module that measures the variance between the expected and actual states of sequential data. This module incorporates a Generic Feature Extraction component, which extracts intrinsic features of outliers and normal instances tailored to specific tasks, and an Expected State Estimator component based on a Graph Neural Network, which learns sequential patterns. To forecast outliers with high confidence and improve alarm accuracy, an Outlier Probability Assessment module is introduced. This module combines a rule-based index derived from expert knowledge with a statistical index calculated from the generated deviations. Our method is evaluated on two real-world tasks: medical imaging analysis using spatially correlated data, and early fault detection of instruments using temporally correlated data. The results show that our method triggers alarms about 70 min earlier than the best competing models on the run-to-failure bearing dataset and achieves an accuracy of 92.82% and a sensitivity of 88.51% on the wireless capsule endoscopy image dataset, consistently outperforming traditional outlier detection algorithms.
Displays, Volume 84, Article 102773 | Pub Date: 2024-06-15 | DOI: 10.1016/j.displa.2024.102773

Fusing differentiable rendering and language–image contrastive learning for superior zero-shot point cloud classification
Jinlong Xie, Long Cheng, Gang Wang, Min Hu, Zaiyang Yu, Minghua Du, Xin Ning

Abstract: Zero-shot point cloud classification involves recognizing categories not encountered during training. Current models often exhibit reduced accuracy on unseen categories without 3D pre-training, emphasizing the need for improved precision and interoperability. We propose a novel approach integrating differentiable rendering with contrastive language–image pre-training. Initially, differentiable rendering autonomously learns representative viewpoints from the data, enabling the transformation of point clouds into multi-view images while preserving key visual information. This transformation facilitates optimized viewpoint selection during training, refining the final feature representation. Features are extracted from the multi-view images and integrated into a global multi-view feature using a cross-attention mechanism. On the textual side, a large language model (LLM) is provided with 3D heuristic prompts to generate 3D-specific text reflecting category-specific traits, from which textual features are derived. The LLM's extensive pre-trained knowledge enables it to capture abstract notions and categorical features relevant to distinct point cloud categories. Visual and textual features are aligned in a unified embedding space, enabling zero-shot classification. Throughout training, the Structural Similarity Index (SSIM) is integrated into the loss function to encourage the model to discern more distinctive viewpoints, reduce redundancy in the multi-view imagery, and enhance computational efficiency. Experimental results on the ModelNet10, ModelNet40, and ScanObjectNN datasets demonstrate classification accuracies of 75.68%, 66.42%, and 52.03%, respectively, surpassing prevailing methods in zero-shot point cloud classification.
Displays, Volume 84, Article 102778 | Pub Date: 2024-06-15 | DOI: 10.1016/j.displa.2024.102778

Retinal projection display with realistic accommodation cue
Qiang Li, Fengbin Rao, Huan Deng, Wenjie Li, Lijun Jiang, Jiafu Lin

Abstract: Although retinal projection displays are always in focus and lose no image resolution, their lack of an accommodation cue makes it difficult to perceive depth correctly. In this paper, we propose a retinal projection display system with a realistic accommodation cue. The proposed system adds an optometric device to a conventional holographic optical element (HOE)-based retinal projection display. The optometric device measures the accommodation response of the human eye as it observes a real scene at different distances. According to the eye's focused depth, refocused images containing realistic focusing and defocusing information are generated through light field refocusing. By establishing a focus-depth lookup table, the refocused image matching the eye's focused depth is presented to the viewer, restoring a realistic accommodation cue in the retinal projection display. A proof-of-concept prototype based on the proposed structure was developed, and the experimental results demonstrated retinal projection with a realistic accommodation cue at depths between 250 mm and 400 mm. The proposed system retains high resolution and always-in-focus viewing while adding the accommodation cue, making the prototype highly competitive for near-eye displays.