2022 7th International Conference on Image, Vision and Computing (ICIVC) — Latest Publications

Object Detection in Optical Remote Sensing Images Based on Improved Lightweight Neural Network
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9886739
Zhen Cheng, Jianshe Xiong, PengCheng Yang, Kai Yang, Yunnuo Chen
Abstract: Optical remote sensing images collected by Unmanned Aerial Vehicle Remote Sensing (UAVRS) carry real-time information, and object detection in such images has significant development potential in many fields such as transportation and agriculture. Besides large objects such as buildings, small objects such as vehicles and ships can also be clearly observed in the collected high-resolution remote sensing images. This paper focuses on the detection of vehicles and ships in remote sensing images and proposes Scene-SSD, based on the main principles of MobileNetV3 and SSD. We improve the basic bottleneck block of MobileNetV3, introduce the Generalized Focal Loss (GFL) function to replace the original loss function in SSD, which mitigates the class-imbalance problem and makes bounding-box estimates more precise, and train the network model by transfer learning to improve its generalization ability. Experiments show that in object detection on remote sensing images the proposed Scene-SSD is fast and reaches a tested mAP of 77.9%, outperforming MobileNetV3-SSDLite with the same network structure in the comparison test.
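The entry above replaces SSD's loss with Generalized Focal Loss. As a rough sketch of the idea only — following the original GFL formulation, not this paper's implementation — the Quality Focal Loss branch of GFL softens the hard 0/1 class label into a localization-quality score and adds a focusing factor that damps well-estimated examples:

```python
import math

def quality_focal_loss(pred_logit, quality_label, beta=2.0):
    """Quality Focal Loss (QFL), the classification branch of GFL.

    pred_logit    : raw score for one class (before sigmoid)
    quality_label : soft target in [0, 1] (e.g. IoU of the matched box;
                    0 for negatives), replacing the hard 0/1 label
    beta          : focusing parameter from the GFL paper
    """
    sigma = 1.0 / (1.0 + math.exp(-pred_logit))  # sigmoid score
    eps = 1e-12
    # binary cross-entropy against the soft quality label
    bce = -(quality_label * math.log(sigma + eps)
            + (1.0 - quality_label) * math.log(1.0 - sigma + eps))
    # modulating factor: down-weights examples whose score already
    # matches the quality target, easing class imbalance
    return abs(quality_label - sigma) ** beta * bce
```

When the predicted score equals the quality target the loss vanishes, so abundant easy negatives contribute little compared with hard foreground examples.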
Citations: 0
An Improved Method of Image Recognition with Deep Learning Combined with Attention Mechanism
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9887045
Fang Xiaoyu, Wang Linlin, Liu Chang, Hong Tao
Abstract: An improved convolutional neural network (CNN) recognition model is proposed to address the low recognition rate and weak generalization ability of existing models on flower images. Highly abstracted features after multiple convolutions are integrated, and network performance is improved by adding a multi-attention module after the residual module of the Inception-ResNet-v2 network and a fully connected layer before the activation function. The improved model is evaluated on the combined OxFlowers 17 and Oxford 102 flower datasets. The results show that the recognition rate of the Inception-ResNet-v2 model combined with the attention mechanism reaches 97.6%, 5.1% higher than that of the original model, significantly improving flower-recognition accuracy.
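The paper's exact multi-attention module is not reproduced in this listing. Purely as an illustration of how a channel-attention block reweights feature maps after a residual stage, a squeeze-and-excitation style sketch (all weights below are made-up placeholders, not the paper's) might look like:

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Minimal squeeze-and-excitation style channel attention.

    feature_map : (C, H, W) array
    w1          : (C//r, C) reduction weights
    w2          : (C, C//r) expansion weights
    """
    squeezed = feature_map.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)         # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gates in (0, 1)
    return feature_map * gates[:, None, None]       # reweight each channel

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4, 4))          # toy feature map
w1 = rng.normal(size=(2, 8)) * 0.1      # placeholder weights
w2 = rng.normal(size=(8, 2)) * 0.1
out = channel_attention(x, w1, w2)
```

Because the gates stay strictly between 0 and 1, the block can only attenuate channels, letting the network emphasize discriminative ones relative to the rest.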
Citations: 0
Research on Task-Driven Dual-Light Image Fusion and Enhancement Method under Low Illumination
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9886778
Bokun Liu, Junyu Wei, Shaojing Su, Xiaozhong Tong
Abstract: In low-light situations, a single visible image cannot convey reliable information and may even lose the target information entirely. This is where the advantages of fusing visible and infrared images stand out. A given pair of visible and infrared images is collectively referred to as a dual-light image in this paper. Making the most of their information and improving the information-expression ability of the fused image is crucial. Traditional evaluation methods use statistical indicators that are not associated with the upstream task. This paper studies an image fusion method driven by the target-detection task: a semantic loss is added to guide dual-light image fusion, and a visual enhancement module weakens the impact of adverse factors (low light, etc.) on the image and raises its information-expression level. The final image is thus more beneficial to target detection.
Citations: 0
Mobile Robot Path Planning Based on the Focused Heuristic Algorithm
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9886971
Jia-Ming Lyu, Tian Ma, Wu Zhang, Yukun Yang
Abstract: To address the low search efficiency, high search cost, and redundant search range of the traditional D* Lite algorithm in path planning, the Focused D* Lite (FDL) algorithm is proposed. The algorithm optimizes and adjusts nodes and edges respectively. First, based on the mobile robot's current coordinates, feasibility judgment and transmission of obstacle information in the eight-neighborhood are carried out to enhance the search capability of each step and ensure the effectiveness of subsequent search. Second, weights are assigned to the planned path to improve its concentration, so that the algorithm focuses on the key, leading path, reducing divergence and invalid search and improving planning efficiency. Simulation results show that the FDL algorithm is more efficient while maintaining the same level of path quality.
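The FDL algorithm itself is not reproduced in this listing. For orientation only, the heuristic eight-neighborhood grid search that D* Lite-family planners build on can be sketched as a plain A* — this is the textbook baseline the abstract's improvements target, not the proposed FDL:

```python
import heapq
import math

def heuristic_search(grid, start, goal):
    """Plain A* on an 8-connected occupancy grid.

    grid : 2D list, 0 = free cell, 1 = obstacle
    start, goal : (row, col) tuples; returns a path list or None
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda p: math.hypot(p[0] - goal[0], p[1] - goal[1])  # focuses search toward goal
    open_heap = [(h(start), 0.0, start, None)]
    came, g_best = {}, {start: 0.0}
    while open_heap:
        _, g, node, parent = heapq.heappop(open_heap)
        if node in came:
            continue                      # already expanded with a better g
        came[node] = parent
        if node == goal:                  # reconstruct path back to start
            path = []
            while node is not None:
                path.append(node)
                node = came[node]
            return path[::-1]
        # expand the eight neighbours, checking bounds and obstacle feasibility
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nb = (node[0] + dr, node[1] + dc)
                if not (0 <= nb[0] < rows and 0 <= nb[1] < cols):
                    continue
                if grid[nb[0]][nb[1]]:
                    continue
                ng = g + math.hypot(dr, dc)      # 1 for straight, sqrt(2) for diagonal
                if ng < g_best.get(nb, float("inf")):
                    g_best[nb] = ng
                    heapq.heappush(open_heap, (ng + h(nb), ng, nb, node))
    return None
```

The Euclidean heuristic is what "focuses" the expansion toward the goal; FDL's weighting and neighborhood information passing are refinements layered on this kind of search.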
Citations: 0
Review of Researches on the Emotion Recognition and Affective Computing Based on HCI
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9886306
Wenqian Lin, Yunjian Zhang
Abstract: Human-computer interaction (HCI) is the third revolution of information technology after cloud computing and big data. HCI design usually involves the physical, cognitive, and emotional levels, and emotion recognition and affective computing (ERAC) are the main content of the emotional level. This paper describes the concept and function of ERAC; analyzes the progress of ERAC research on facial expression, voice, text, physiological signals, and other modalities; and expounds the applications of ERAC in computer science, health care, media entertainment, intelligent equipment, education, and other fields. Finally, to provide a reference and basis for further research, open problems and future work are discussed.
Citations: 0
Infrared and Visible Image Fusion Based on Biological Vision
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9887132
Qianqian Han, Runping Xi, Qian Chen
Abstract: Infrared images capture salient targets, while visible images contain richer details, so fusing these two types of images is vital. Benefiting from a dual-mode cellular mechanism, the rattlesnake is able to process and fuse infrared and visible signals, improving its predatory ability. In this paper, we design an auto-encoder fusion network based on the visual antagonistic receptive field. In this network, we build a feature-level fusion strategy based on the dual-modal cell mechanism, simulated by the center-antagonistic receptive field of human visual cells. Meanwhile, we optimize the feature extraction and feature reconstruction modules of the fusion network. By combining research in biological vision and computer vision, our network delivers better performance than state-of-the-art methods in both subjective and objective evaluation.
Citations: 0
Learnable Upsampling-Based Point Cloud Semantic Segmentation
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9886287
Xue Xiang, Wenpeng Zong, Guangyun Li
Abstract: Point cloud semantic segmentation networks based on point-wise multi-layer perceptrons (MLPs) have been widely applied thanks to their end-to-end advantages. Such networks normally use a traditional upsampling algorithm to recover point cloud details in the decoding stage. However, point clouds carry rich 3D geometric information, and traditional interpolation does not consider geometric correlation when recovering details, resulting in inaccurate output point features. To this end, a learnable upsampling algorithm is proposed in this paper. It is implemented using moving least squares (MLS) and radial basis functions (RBFs), which fully exploit the local geometric features of point clouds and accurately restore scene details. The validity of the proposed upsampling operator is verified on the Semantic3D dataset. Experimental results show that the proposed upsampling algorithm is superior to the widely applied traditional interpolation algorithms for point cloud semantic segmentation.
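The abstract describes a learnable MLS/RBF upsampling operator whose learnable parts are not specified in this listing. As an illustrative stand-in with fixed (non-learned) weights, Gaussian-RBF interpolation of coarse point features onto a dense point set can be sketched as:

```python
import numpy as np

def rbf_upsample(coarse_xyz, coarse_feat, dense_xyz, sigma=0.5):
    """Interpolate features from a coarse point set onto a dense one
    with a Gaussian RBF kernel (fixed bandwidth, not learned).

    coarse_xyz  : (M, 3) coordinates of the coarse points
    coarse_feat : (M, C) features attached to the coarse points
    dense_xyz   : (N, 3) coordinates to interpolate onto
    returns     : (N, C) interpolated features
    """
    # pairwise squared distances, shape (N, M)
    d2 = ((dense_xyz[:, None, :] - coarse_xyz[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian RBF weights
    w /= w.sum(axis=1, keepdims=True)      # normalize so each row sums to 1
    return w @ coarse_feat                 # distance-weighted feature blend
```

Because the weights decay with distance, nearby coarse points dominate each dense point's feature, which is the geometric sensitivity plain inverse-distance or nearest-neighbor interpolation lacks; the paper's learnable version would replace the fixed kernel with trained parameters.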
Citations: 1
MITPose: Multi-Granularity Feature Interaction for Human Pose Estimation
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9887304
Jiayu Zou, Jie Qin, Zhen Zhang, Xingang Wang
Abstract: Human pose estimation is broadly used in action recognition, re-identification, and multi-object tracking. Deep convolutional neural networks have recently demonstrated great power in human pose estimation, but CNN-based methods are limited by a constrained receptive field and perform poorly at modeling global relationships between different body parts. In this paper, we propose MITPose, a novel multi-granularity feature-interaction network for human pose estimation that exploits interaction among global-local features, multi-scale features, and locality features. MITPose efficiently leverages the long-range representation ability of transformer networks and the inductive locality of convolution networks to obtain comprehensive information for keypoint localization and relationship modeling. Extensive experiments show that MITPose achieves state-of-the-art performance on the public COCO dataset.
Citations: 0
A Robust Approach for Smile Recognition via Deep Convolutional Neural Networks
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9886093
Yuanzhu Liu, Zuoli Liu, Yong Zhao, Junli Xu
Abstract: Smile recognition is a difficult research issue in computer vision and pattern recognition. Most existing algorithms are only suitable for recognizing Western people's smiles against simple backgrounds and cannot recognize Chinese people's smiles well in complex backgrounds. To solve this problem, we first construct a dataset of 4,000 Western face images and 4,000 Chinese face images, 5,000 of which have complex backgrounds. We then use this dataset to train a convolutional neural network, a residual neural network, and a lightweight neural network for smile recognition, respectively. Various experiments show that our algorithm generalizes well, robustly recognizing the smiles of both Western and Chinese people even in complex backgrounds.
Citations: 1
Semi-Supervised Semantic Segmentation of Class-Imbalanced Images: A Hierarchical Self-Attention Generative Adversarial Network
2022 7th International Conference on Image, Vision and Computing (ICIVC) Pub Date: 2022-07-26 DOI: 10.1109/ICIVC55077.2022.9886496
Lu Chai, Qinyuan Liu
Abstract: How to train models with unlabeled data and apply one trained model across several datasets are key problems in computer vision applications that require high-cost annotations. Recently, a generative model [1] proved its advantages in semi-supervised segmentation and out-of-domain generalization; however, that method becomes less effective on class-imbalanced images whose foreground occupies only small areas. To solve this problem, we introduce a hierarchical generative model with a self-attention mechanism that helps capture features of foreground objects. Concretely, we apply a two-stage hierarchical generative model to perform image synthesis with self-attention. Since attention maps also serve as semantic labels in segmentation, the hierarchical self-attention model can synthesize images and their corresponding segmentation labels simultaneously. At test time, segmentation is achieved by mapping input images into latent representations with two encoders and synthesizing labels with the generative model. We evaluate our hierarchical model on three biomedical segmentation datasets. Experimental results demonstrate that our method outperforms other baselines on semi-supervised segmentation of class-imbalanced images while preserving out-of-domain generalization ability.
Citations: 1