Journal of Electronic Imaging: Latest Articles

Improved self-supervised learning for disease identification in chest X-ray images
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043006
Yongjun Ma, Shi Dong, Yuchao Jiang
Abstract: The utilization of chest X-ray (CXR) image data analysis for assisting in disease diagnosis is an important application of artificial intelligence. Supervised learning faces challenges due to a lack of large-scale labeled datasets and inaccuracies. Self-supervised learning offers a potential solution, but current research in this area is limited, and the diagnostic accuracy remains unsatisfactory. We propose an approach that integrates the self-supervised Bidirectional Encoder Representations from Image Transformers version 2 (BEiTv2) method with the vector quantization-based knowledge distillation (VQ-KD) strategy into CXR image data to enhance disease diagnosis accuracy. Our methodology demonstrates superior performance compared with existing self-supervised methods, showcasing its efficacy in improving diagnostic outcomes. Through transfer and ablation studies, we elucidate the benefits of the VQ-KD strategy in enhancing model performance and transferability to downstream tasks.
Citations: 0
Deep degradation-aware up-sampling-based depth video coding
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043009
Zhaoqing Pan, Yuqing Niu, Bo Peng, Ge Li, Sam Kwong, Jianjun Lei
Abstract: The smooth regions in depth videos contain a significant proportion of homogeneous content, resulting in many spatial redundancies. To improve the coding efficiency of depth videos, this paper proposes a deep degradation-aware up-sampling-based depth video coding method. For reducing spatial redundancies effectively, the proposed method compresses the depth video at a low resolution, and restores the resolution by utilizing the learning-based up-sampling technology. To recover high-quality depth videos, a degradation-aware up-sampling network is proposed, which explores the degradation information of compression artifacts and sampling artifacts to restore the resolution. Specifically, the compression artifact removal module is used to obtain refined low-resolution depth frames by learning the representation of compression artifacts. Meanwhile, a jointly optimized learning strategy is designed to enhance the capability of recovering high-frequency details, which is beneficial for up-sampling. According to the experimental results, the proposed method achieves considerable performance in depth video coding compared with 3D-HEVC.
Citations: 0
Pose-guided node and trajectory construction transformer for occluded person re-identification
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043021
Chentao Hu, Yanbing Chen, Lingyi Guo, Lingbing Tao, Zhixin Tie, Wei Ke
Abstract: Occluded person re-identification (re-id) is a task in pedestrian retrieval where occluded person images are matched with holistic person images. Most methods leverage semantic cues from external models to align the availability of visible parts in the feature space. However, presenting visible parts while discarding occluded parts can lead to the loss of semantics in the occluded regions, and in severely crowded regions of occlusion, it will introduce inaccurate features that pollute the overall person features. Thus, constructing person features for occluded regions based on the features of its holistic parts has the potential to address the above issues. In this work, we propose a pose-guided node and trajectory construction transformer (PNTCT). The part feature extraction module extracts parts feature of the person and incorporates pose information to activate key visible local features. However, this is not sufficient to completely separate occluded regions. To further distinguish visible and occluded parts, the skeleton graph module adopts a graph topology to represent local features as graph nodes, enhancing the network’s sensitivity to local features by constructing a skeleton feature graph, which is further utilized to weaken the occlusion noise. The node and trajectory construction module (NTC) mines the relationships between skeleton nodes and aggregates the information of the person’s skeleton to construct a novel skeleton graph. The features of the occluded regions can be reconstructed via the features of the corresponding nodes in the novel skeleton graph. Extensive experiments and analyses confirm the effectiveness and superiority of our PNTCT method.
Citations: 0
Three-dimensional human pose estimation based on contact pressure
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043022
Ning Yin, Ke Wang, Nian Wang, Jun Tang, Wenxia Bao
Abstract: Various daily behaviors usually exert pressure on the contact surface, such as lying, walking, and sitting. Obviously, the pressure data from the contact surface contain some important biological information for an individual. Recently, a computer vision task, i.e., pose estimation from contact pressure (PECP), has received more and more attention from researchers. Although several deep learning-based methods have been put forward in this field, they cannot achieve accurate prediction using the limited pressure information. To address this issue, we present a multi-task-based PECP model. Specifically, the autoencoder is introduced into our model for reconstructing input pressure data (i.e., the additional task), which can help our model generate high-quality features for the pressure data. Moreover, both the mean squared error and the spectral angle distance are adopted to construct the final loss function, whose aim is to eliminate the Euclidean distance and angle differences between the prediction and ground truth. Extensive experiments on the public dataset show that our method outperforms existing methods significantly in pose prediction from contact pressure.
Citations: 0
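The loss described in the abstract above combines mean squared error with a spectral angle distance. A minimal sketch of that idea follows. It is not the authors' implementation: the function names, the simple additive combination, and the `angle_weight` parameter are assumptions made for illustration, since the abstract does not say how the two terms are weighted.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def spectral_angle(pred, target, eps=1e-12):
    """Angle in radians between two vectors; 0 means they point the same way."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in target))
    cos = dot / max(norm_p * norm_t, eps)
    cos = max(-1.0, min(1.0, cos))  # clamp against floating-point rounding
    return math.acos(cos)


def combined_loss(pred, target, angle_weight=1.0):
    """Hypothetical combination: MSE plus a weighted angle term (weight is an assumption)."""
    return mse(pred, target) + angle_weight * spectral_angle(pred, target)


# A prediction with the right direction but the wrong scale has zero angle
# error, so only the MSE term penalizes it.
print(combined_loss([1.0, 0.0], [2.0, 0.0]))
# Orthogonal vectors are penalized by both terms.
print(combined_loss([1.0, 0.0], [0.0, 1.0]))
```

The two terms are complementary in this sketch: the MSE term responds to differences in magnitude, while the angle term responds only to differences in direction.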
SDANet: scale-deformation awareness network for crowd counting
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043002
Jianyong Wang, Xiangyu Guo, Qilei Li, Ahmed M. Abdelmoniem, Mingliang Gao
Abstract: Crowd counting aims to derive information about crowd density by quantifying the number of individuals in an image or video. It offers crucial insights applicable to various domains, e.g., secure, efficient decision-making, and management. However, scale variation and irregular shapes of heads pose intricate challenges. To address these challenges, we propose a scale-deformation awareness network (SDANet). Specifically, a scale awareness module is introduced to address the scale variation. It can capture long-distance dependencies and preserve precise spatial information by readjusting weights in height and width directions. Concurrently, a deformation awareness module is introduced to solve the challenge of head deformation. It adjusts the sampling position of the convolution kernel through deformable convolution and learning offset. Experimental results on four crowd-counting datasets prove the superiority of SDANet in accuracy, efficiency, and robustness.
Citations: 0
Lightweight human activity recognition system for resource constrained environments
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043025
Mihir Karandikar, Ankit Jain, Abhishek Srivastava
Abstract: As the elderly population in need of assisted living arrangements continues to grow, the imperative to ensure their safety is paramount. Though effective, traditional surveillance methods, notably RGB cameras, raise significant privacy concerns. This paper highlights the advantages of a surveillance system addressing these issues by utilizing skeleton joint sequences extracted from depth data. The focus on non-intrusive parameters aims to mitigate ethical and privacy concerns. Moreover, the proposed work prioritizes resource efficiency, acknowledging the often limited computing resources in assisted living environments. We strive for a method that can run efficiently even in the most resource-constrained environments. Performance evaluation and a prototypical implementation of our method on a resource-constraint device confirm the efficacy and suitability of the proposed method in real-world applications.
Citations: 0
Research on ground-based cloud image classification combining local and global features
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043030
Xin Zhang, Wanting Zheng, Jianwei Zhang, Weibin Chen, Liangliang Chen
Abstract: Clouds are an important factor in predicting future weather changes. Cloud image classification is one of the basic issues in the field of ground-based cloud meteorological observation. Deep CNN mainly focuses on the local receptive field, and the processing of global information may be relatively weak. In ground-based cloud image classification, if there is a complex background, it will help to better model the long-range dependence of the image if the relationship between different locations in the image can be globally captured. A ground-based cloud image classification method is proposed based on the fusion of local features and global features (LG_CloudNet). The ground-based cloud image classification method integrates the global feature extraction module (GF_M) and the local feature extraction module (LF_M), using the attention mechanism to weight and merge features, respectively. The LG_CloudNet model enables richer and comprehensive feature representation at lower computational complexity. In order to ensure the learning and generalization capabilities of the model during training, AdamW (Adam weight decay) is combined with learning rate warm-up and stochastic gradient descent with warm restarts methods to adjust the learning rate. The experimental results demonstrate that the proposed method achieves favorable ground-based cloud image classification outcomes and exhibits robust performance in classifying cloud images. In the datasets of GCD, CCSN, and ZNCL, the classification accuracy is 94.94%, 95.77%, and 98.87%, respectively.
Citations: 0
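The abstract above mentions a learning rate warm-up combined with warm restarts. One common way to realize such a schedule is a linear warm-up followed by cosine annealing that periodically jumps back to the base rate. The sketch below illustrates only the schedule, not the AdamW optimizer or the authors' training code; the function name, the fixed cycle length, and all default values are assumptions.

```python
import math


def learning_rate(step, base_lr=1e-3, min_lr=1e-5, warmup_steps=5, cycle_steps=10):
    """Linear warm-up, then cosine annealing with warm restarts.

    All defaults are illustrative. A fixed cycle length is assumed here;
    schedules of this kind often lengthen the cycle after each restart.
    """
    if step < warmup_steps:
        # Ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Position inside the current cosine cycle (0 at each restart).
    t = (step - warmup_steps) % cycle_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / cycle_steps))
    return min_lr + (base_lr - min_lr) * cosine


# Print the schedule over warm-up plus two full cycles.
for step in range(25):
    print(step, f"{learning_rate(step):.6f}")
```

In this sketch the rate climbs during the first five steps, decays along a cosine curve for ten steps, and then restarts at the base rate at steps 15 and 25.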
Radar spectrum-image fusion using dual 2D-3D convolutional neural network to transformer inspired multi-headed self-attention bi-long short-term memory network for vehicle recognition
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043010
Ferris I. Arnous, Ram M. Narayanan
Abstract: Radar imaging techniques, such as synthetic aperture radar, are widely explored in automatic vehicle recognition algorithms for remote sensing tasks. A large basis of literature covering several machine learning methodologies using visual information transformers, self-attention, convolutional neural networks (CNN), long short-term memory (LSTM), CNN-LSTM, CNN-attention-LSTM, and CNN Bi-LSTM models for detection of military vehicles have been attributed with high performance using a combination of these approaches. Tradeoffs between differing number of poses, single/multiple feature extraction streams, use of signals and/or images, as well as the specific mechanisms used to combine them, have widely been debated. We propose the adaptation of several models towards a unique biologically inspired architecture that utilizes both multi-pose and multi-contextual image and signal radar sensor information to make vehicle assessments over time. We implement a compact multi-pose 3D CNN single stream to process and fuse multi-temporal images while a dual sister 2D CNN stream processes the same information over a lower-dimensional power-spectral domain to mimic the way multi-sequence visual imagery is combined with auditory feedback for enhanced situational awareness. These data are then fused across data domains using transformer-modified encoding blocks to Bi-LSTM segments. Classification results on a fundamentally controlled simulated dataset yielded accuracies of up to 98% and 99% in line with literature. This enhanced performance was then evaluated for robustness not previously explored for three simultaneous parameterizations of incidence angle, object orientation, and lowered signal-to-noise ratio values and found to increase recognition on all three cases for low to moderate noised environments.
Citations: 0
Underwater object detection by integrating YOLOv8 and efficient transformer
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043011
Jing Liu, Kaiqiong Sun, Xiao Ye, Yaokun Yun
Abstract: In recent years, underwater target detection algorithms based on deep learning have greatly promoted the development of the field of marine science and underwater robotics. However, due to the complexity of the underwater environment, there are problems, such as target occlusion, overlap, background confusion, and small object, that lead to detection difficulties. To address this issue, this paper proposes an improved underwater target detection method based on YOLOv8s. First, a lightweight backbone network with efficient transformers is used to replace the original backbone network, which enhances the contextual feature extraction capability. Second, an improved bidirectional feature pyramid network is used in the later multi-scale fusion part by increasing the input of bottom-level information while reducing the model size and number of parameters. Finally, a dynamic head with an attention mechanism is introduced into the detection head to enhance the classification and localization of small and fuzzy targets. Experimental results show that the proposed method improves the mAP0.5:0.95 of 65.7%, 63.7%, and 51.2% with YOLOv8s to that of 69.2%, 66.8%, and 54.8%, on three public underwater datasets, DUO, RUOD, and URPC2020, respectively. Additionally, compared with the YOLOv8s model, the model size decreased from 21.46 to 15.56 MB, and the number of parameters decreased from 11.1 to 7.9 M.
Citations: 0
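The figures reported in the abstract above can be turned into per-dataset gains and relative reductions with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract; the variable and function names are illustrative.

```python
def relative_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


# mAP0.5:0.95 (%) as quoted in the abstract: (YOLOv8s, proposed method).
map_scores = {"DUO": (65.7, 69.2), "RUOD": (63.7, 66.8), "URPC2020": (51.2, 54.8)}
map_gain = {name: round(new - old, 1) for name, (old, new) in map_scores.items()}

size_reduction = relative_reduction(21.46, 15.56)   # model size in MB
param_reduction = relative_reduction(11.1, 7.9)     # parameters in millions

print(map_gain)
print(f"model size: -{size_reduction:.1f}%, parameters: -{param_reduction:.1f}%")
```

Under these figures the gains are 3.5, 3.1, and 3.6 mAP points on DUO, RUOD, and URPC2020, and the model shrinks by roughly 27.5% in size and 28.8% in parameter count.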
Scale separation: video crowd counting with different density maps
IF 1.1, Zone 4, Computer Science
Journal of Electronic Imaging Pub Date: 2024-07-01 DOI: 10.1117/1.jei.33.4.043016
Ao Zhang, Xin Deng, Baoying Liu, Weiwei Zhang, Jun Guo, Linrui Xie
Abstract: Most crowd counting methods rely on integrating density maps for prediction, but they encounter performance degradation in the face of density variations. Existing methods primarily employ a multi-scale architecture to mitigate this issue. However, few approaches concurrently consider both scale and timing information. We propose a scale-divided architecture for video crowd counting. Initially, density maps of different Gaussian scales are employed to retain information at various scales, accommodating scale changes in images. Subsequently, we observe that the spatiotemporal network places greater emphasis on individual locations, prompting us to aggregate temporal information at a specific scale. This design enables the temporal model to acquire more spatial information and alleviate occlusion issues. Experimental results on various public datasets demonstrate the superior performance of our proposed method.
Citations: 0
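The abstract above refers to counting by integrating density maps built at different Gaussian scales. The sketch below shows the general idea on a tiny grid: each annotated head contributes one normalized Gaussian, so the map sums to the head count whatever the scale. This is a generic illustration, not the authors' architecture; the function names, grid size, head positions, and sigma values are assumptions.

```python
import math


def density_map(points, height, width, sigma):
    """Place one normalized Gaussian per annotated head position (row, col)."""
    grid = [[0.0] * width for _ in range(height)]
    for (py, px) in points:
        kernel = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        # Normalize over the grid so each head contributes a total mass of 1.
        total = sum(map(sum, kernel))
        for y in range(height):
            for x in range(width):
                grid[y][x] += kernel[y][x] / total
    return grid


def count(grid):
    """Integrating (summing) the density map gives the estimated head count."""
    return sum(map(sum, grid))


heads = [(4, 4), (10, 12), (15, 3)]        # hypothetical annotations
fine = density_map(heads, 20, 20, sigma=1.0)    # sharp peaks, precise locations
coarse = density_map(heads, 20, 20, sigma=3.0)  # smooth blobs, coarser scale
print(count(fine), count(coarse))
```

In this sketch the small-sigma map keeps sharp peaks at individual locations while the large-sigma map spreads each head over a wider area, and both sum to the same count.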