Latest Publications: 2022 7th International Conference on Image, Vision and Computing (ICIVC)

DenseATT-Net: Densely-Connected Neural Network with Intensive Attention Modules for 3D ABUS Mass Segmentation
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9886080
Hengyu Zhang, Jingxuan Xu, Mengyu Wang, Yanfeng Li
{"title":"DenseATT-Net: Densely-Connected Neural Network with Intensive Attention Modules for 3D ABUS Mass Segmentation","authors":"Hengyu Zhang, Jingxuan Xu, Mengyu Wang, Yanfeng Li","doi":"10.1109/ICIVC55077.2022.9886080","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9886080","url":null,"abstract":"Accurate segmentation of breast mass in 3D automated breast ultrasound (ABUS) images is important in breast cancer analysis. However, it is hard to obtain enough labeled ABUS images for training segmentation networks, which may lead to over-fitting problem in deep learning based methods. Aiming at this problem, a lightweight segmentation network D2U-Net is selected as the baseline. ABUS images have a low signal-to-noise ratio and serious artifacts, which makes mass boundary unclear. To address this problem, different kinds of attention modules are inserted into the segmentation network. These attention modules include spatial attention, channel attention, convolutional block attention module (CBAM) and squeeze-and-excitation (SE) block. The whole segmentation network is termed as DenseATT-Net. An ABUS dataset with 170 volumes is employed to verify the segmentation performance. Experimental results show that the proposed method performs better than other segmentation models on 3D ABUS images.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115439817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
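As a concrete reference for the attention modules named in the abstract, here is a minimal PyTorch sketch of a squeeze-and-excitation (SE) block for 3D feature maps, one of the four module types DenseATT-Net inserts. The reduction ratio, tensor shapes, and placement are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: reweight channels by global context."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)      # squeeze: global average over D, H, W
        self.fc = nn.Sequential(                 # excitation: two-layer bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c = x.shape[:2]
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1, 1)
        return x * w                             # channel-wise reweighting

# Example: an assumed feature-map shape from a 3D ABUS segmentation network
feat = torch.randn(2, 32, 16, 64, 64)            # (batch, channels, D, H, W)
print(SEBlock(32)(feat).shape)                   # torch.Size([2, 32, 16, 64, 64])
```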
Non-cooperative Space Target High-Speed Tracking Measuring Method Based on FPGA
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9887187
Kailiang Han, Haodong Pei, Zhentao Huang, Tao Huang, Shangshi Qin
{"title":"Non-cooperative Space Target High-Speed Tracking Measuring Method Based on FPGA","authors":"Kailiang Han, Haodong Pei, Zhentao Huang, Tao Huang, Shangshi Qin","doi":"10.1109/ICIVC55077.2022.9887187","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9887187","url":null,"abstract":"Based on the analysis of visible image features of space targets, a high-speed tracking and measurement method is proposed for space non-cooperative targets in a starry background. The interference of background stars on the real target detection results is excluded by using algorithms such as clustering analysis of target-oriented graphical features, and the tracking detection of the target is completed by the shape center extraction algorithm. The algorithm is finally applied to an FPGA-based space embedded system. Finally, the software operation efficiency is improved by optimizing the clustering algorithm and setting the region of interest, and the parallel processing capability of FPGA is used to realize the processing of image data while reading. The software image processing speed is tested to reach 50Hz for targets with imaging scales between 3x3 pixels and 500x500 pixels. Currently, the design has been applied to a practical system with good application results.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123182233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
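The shape-center extraction step can be illustrated with a short sketch: threshold a frame, label connected blobs, and take each blob's centroid. The thresholding and SciPy connected-component labeling below are generic stand-ins for the paper's clustering and FPGA implementation, not the actual design.

```python
import numpy as np
from scipy import ndimage

def shape_centers(frame: np.ndarray, thresh: float):
    """Threshold a grayscale frame, label connected blobs, return their centroids."""
    binary = frame > thresh                  # segment bright candidates from the sky background
    labels, n = ndimage.label(binary)        # connected-component "clustering"
    # centroid (shape center) of each labeled blob
    return ndimage.center_of_mass(binary, labels, list(range(1, n + 1)))

frame = np.zeros((64, 64))
frame[10:13, 20:23] = 1.0                    # a 3x3-pixel target
print(shape_centers(frame, 0.5))             # [(11.0, 21.0)]
```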
Human Action Recognition Based on Three-Stream Network with Frame Sequence Features
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9887162
Ruifeng Huang, Chong Chen, Rui Cheng, Y. Zhang, Jiabing Zhu
{"title":"Human Action Recognition Based on Three-Stream Network with Frame Sequence Features","authors":"Ruifeng Huang, Chong Chen, Rui Cheng, Y. Zhang, Jiabing Zhu","doi":"10.1109/ICIVC55077.2022.9887162","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9887162","url":null,"abstract":"In the field of human action recognition (HAR), two-stream models have been widely employed. In recent years, traditional two-stream network models have disregarded the interframe sequence characteristics of video, resulting in a decrease in model robustness when local sequence information and long-term motion information interact. In light of this, a novel three-stream neural network is proposed by combining the long-term and short-term characteristics of a frame sequence with spatio-temporal information. Initially, the optical flow sequence image frames and RGB image frames in the video are extracted, the optical flow motion information and image space information in the video is obtained, the corresponding time network and space network are entered, and the spatial information is entered into the sequence feature processing network; the three networks are then pretrained. At the conclusion of training, the operation of feature extraction is executed, the features are incorporated with the parallel fusion algorithm by adding weights, and the behavior categories are classified using Multi-Layer Perception. Experimental results on the UCF11, UCF50, and HMDB51 datasets demonstrate that our model effectively integrates the spatial-temporal and frame-sequence information of human actions, resulting in a significant improvement in recognition accuracy. Its classification accuracy on the three datasets was 99.17%, 97.40%, and 96.88%, respectively, significantly enhancing the generalization capability and validity of conventional two-stream or three-stream models.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128364103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
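A minimal sketch of the weighted parallel-fusion step described above: three pretrained stream features are combined with learned weights and classified by a multi-layer perceptron. The feature dimension and MLP width are assumptions for illustration.

```python
import torch
import torch.nn as nn

class WeightedFusionHead(nn.Module):
    """Fuse three stream features by learned weights, then classify with an MLP."""
    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(3))   # one scalar weight per stream
        self.mlp = nn.Sequential(
            nn.Linear(dim, 256), nn.ReLU(inplace=True), nn.Linear(256, num_classes)
        )

    def forward(self, spatial, temporal, sequence):
        w = torch.softmax(self.weights, dim=0)       # normalize the fusion weights
        fused = w[0] * spatial + w[1] * temporal + w[2] * sequence
        return self.mlp(fused)

head = WeightedFusionHead(dim=512, num_classes=11)   # e.g. 11 classes for UCF11
feats = [torch.randn(4, 512) for _ in range(3)]      # pretrained per-stream features
print(head(*feats).shape)                            # torch.Size([4, 11])
```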
Similarity Measurement Human Actions with GNN
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9887189
Xiuxiu Li, Pu Zhang, Chaoxian Wang, Shengjun Wu
{"title":"Similarity Measurement Human Actions with GNN","authors":"Xiuxiu Li, Pu Zhang, Chaoxian Wang, Shengjun Wu","doi":"10.1109/ICIVC55077.2022.9887189","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9887189","url":null,"abstract":"Measuring the similarity of human actions represented with human skeletons from motion capture plays an important role in classification, retrieval and analysis of actions. In this paper, a similarity measurement method of human action based on graph neural network is proposed. In this method, due to the introduction of graph convolution neural network, the dependence between adjacent joints in human skeleton can be obtained, which makes the expression of human action in a frame more accurate. In the further action similarity measurement, LSTM with self- attention is used to extract temporal feature of the human action sequence, and finally MMD-NCA is used to measure the similarity of action sequences. Experiments on public dataset verify the effectiveness of this method in action recognition and action similarity measurement.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"48 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120908042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
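For illustration, a single graph-convolution layer over skeleton joints might look like the following sketch, with a symmetrically normalized adjacency matrix encoding dependence between adjacent joints. The toy four-joint skeleton and layer sizes are assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

class SkeletonGCNLayer(nn.Module):
    """One graph-convolution layer over joints: X' = ReLU(A_hat @ X @ W)."""
    def __init__(self, in_dim: int, out_dim: int, adj: torch.Tensor):
        super().__init__()
        a = adj + torch.eye(adj.size(0))             # add self-loops
        d = a.sum(dim=1).rsqrt().diag()              # D^{-1/2}
        self.register_buffer("a_hat", d @ a @ d)     # symmetric normalization
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, joints, features) for one frame
        return torch.relu(self.a_hat @ self.linear(x))

# Toy 4-joint chain skeleton: 0-1-2-3
adj = torch.tensor([[0, 1, 0, 0], [1, 0, 1, 0],
                    [0, 1, 0, 1], [0, 0, 1, 0]], dtype=torch.float)
layer = SkeletonGCNLayer(3, 16, adj)                 # 3D joint coordinates in, 16-d out
print(layer(torch.randn(8, 4, 3)).shape)             # torch.Size([8, 4, 16])
```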
An Encoder-Decoder Network with Residual and Attention Blocks for Full-Face 3D Gaze Estimation
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9886734
Xinyuan Song, Shaoxiang Guo, Zhenfu Yu, Junyu Dong
{"title":"An Encoder-Decoder Network with Residual and Attention Blocks for Full-Face 3D Gaze Estimation","authors":"Xinyuan Song, Shaoxiang Guo, Zhenfu Yu, Junyu Dong","doi":"10.1109/ICIVC55077.2022.9886734","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9886734","url":null,"abstract":"This paper proposes a novel end-to-end network to improve the accuracy of gaze estimation task with full-face image as input. We first explored the possibility of using the encoder-decoder network to reconstruct the input face image, then we used U-Net with residual blocks to retain eyes features hidden in high resolution feature map layers, which are often lost during down-sampling and convolution layers. Finally, we applied spatial and channel-wise attention blocks to our model to better consider the relations among different regions globally and enhance the contribution of valuable gaze-related regions. We conducted experiments on the ETH-XGaze dataset. The results turned out that our proposed model is very competitive compared with existing state-of-the-art methods for person-independent gaze estimation.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"15 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120928903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
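A minimal sketch of a CBAM-style spatial attention block of the kind the abstract describes, which reweights each location by pooled channel statistics; the kernel size and feature-map shape are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Weight each spatial location by pooled channel statistics (CBAM-style)."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)            # average over channels
        mx, _ = x.max(dim=1, keepdim=True)           # max over channels
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                              # emphasize gaze-relevant regions

feat = torch.randn(2, 64, 56, 56)                    # a face feature map
print(SpatialAttention()(feat).shape)                # torch.Size([2, 64, 56, 56])
```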
Improved YOLOv5 with Transformer for Large Scene Military Vehicle Detection on SAR Image
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9887095
Yi Sun, Wenna Wang, Qianyu Zhang, Han Ni, Xiuwei Zhang
{"title":"Improved YOLOv5 with Transformer for Large Scene Military Vehicle Detection on SAR Image","authors":"Yi Sun, Wenna Wang, Qianyu Zhang, Han Ni, Xiuwei Zhang","doi":"10.1109/ICIVC55077.2022.9887095","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9887095","url":null,"abstract":"With the development of SAR technology, large scene object detection on SAR images has attracted more and more attention. Exiting large scene object detection is mainly based on the CNN network, which limits the obtaining of global context information. On the other hand, due to the high acquisition cost of SAR images, there are no existing public datasets in military vehicle detection. To solve these problems, we adopt the Transformer module to construct the neck block based on YOLOv5. This design can gain global context information, and also has better performance for small objects detection. Furthermore, to achieve the detection of large-scale military ground vehicles, we construct a dataset based on the MSTAR dataset, named LSGVOD. Extensive experiments have been conducted on LSGVOD, and experimental results show that the proposed method greatly improves detection accuracy. Compared to other methods, it achieves the best accuracy with 93.3% mAP.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121209558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
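The Transformer-based neck can be illustrated by flattening a CNN feature map into one token per location and running the tokens through a standard encoder layer, which gives every location global context. This sketch uses PyTorch's stock nn.TransformerEncoderLayer as a stand-in for the paper's module; the channel count and head count are assumptions.

```python
import torch
import torch.nn as nn

class TransformerNeckBlock(nn.Module):
    """Run a CNN feature map through a Transformer encoder for global context."""
    def __init__(self, channels: int, heads: int = 8):
        super().__init__()
        self.encoder = nn.TransformerEncoderLayer(
            d_model=channels, nhead=heads,
            dim_feedforward=2 * channels, batch_first=True,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)    # (B, H*W, C): one token per location
        tokens = self.encoder(tokens)            # global self-attention across locations
        return tokens.transpose(1, 2).reshape(b, c, h, w)

feat = torch.randn(1, 256, 20, 20)               # a YOLO-style neck feature map
print(TransformerNeckBlock(256)(feat).shape)     # torch.Size([1, 256, 20, 20])
```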
Robust Facial Expression Recognition Based on Dual Branch Multi-feature Learning
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9886565
Xuewen Liu, Zhe Guo, Boya Yuan, Haojie Guo
{"title":"Robust Facial Expression Recognition Based on Dual Branch Multi-feature Learning","authors":"Xuewen Liu, Zhe Guo, Boya Yuan, Haojie Guo","doi":"10.1109/ICIVC55077.2022.9886565","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9886565","url":null,"abstract":"Facial expression recognition (FER) is a key factor in human behavior analysis. Most algorithms are difficult to distinguish the subtle differences of local facial features, such as facial wrinkles and mouth corners. To solve the above problems, We propose the Dual Branch Multi-feature Learning Network (DBML-Net) to explore the latent incentive. It contains two branchs. One branch works to extract apparent features from the original images, the other uses two texture features (CS-LOP and ALDP) to enhance the detailed information. A Densely Connected Dynamic Selective Kernel Network (Dense-SK) is constructed as the feature extraction section of branch one. The extensive experimental results show that the DBML-Net achieves state-of-the-art performance on three widely used FER datasets: CK+, Oulu-CASIA and JAFFE, which demonstrate the effectiveness of our method.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122746477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
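A minimal sketch of the dual-branch idea: one branch sees the RGB face crop, the other sees precomputed texture maps, and their pooled features are concatenated for classification. The tiny stand-in backbones below are assumptions; the paper's apparent branch is the Dense-SK network, and the CS-LOP/ALDP maps appear here only as generic input planes.

```python
import torch
import torch.nn as nn

class DualBranchFER(nn.Module):
    """Two-branch fusion: apparent features from RGB plus texture-map features."""
    def __init__(self, num_classes: int = 7):
        super().__init__()
        # Stand-in backbones; the paper uses Dense-SK for the apparent branch.
        self.apparent = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.texture = nn.Sequential(nn.Conv2d(2, 32, 3, 2, 1), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, rgb, texture_maps):
        fused = torch.cat([self.apparent(rgb), self.texture(texture_maps)], dim=1)
        return self.classifier(fused)

model = DualBranchFER()
rgb = torch.randn(4, 3, 96, 96)                  # face crops
tex = torch.randn(4, 2, 96, 96)                  # two texture maps (CS-LOP/ALDP stand-ins)
print(model(rgb, tex).shape)                     # torch.Size([4, 7])
```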
Research on Fast Extraction of Information System from Online Social Network Images Based on Big Data Algorithm
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9886207
R. Qi, Yuanlong Chen, Xiaojiang Sun, Si Qingaowa, Shenghui Chen
{"title":"Research on Fast Extraction of Information System from Online Social Network Images Based on Big Data Algorithm","authors":"R. Qi, Yuanlong Chen, Xiaojiang Sun, Si Qingaowa, Shenghui Chen","doi":"10.1109/ICIVC55077.2022.9886207","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9886207","url":null,"abstract":"The paper proposes a social network image tag sorting big data algorithm, which combines SIFT features, convolutional neural network features, and visual bag-of-words model to obtain the target image's visual neighbour image set from the image training set. The paper makes all the visual neighbour images the initial label of the target image for weighted voting and calculates the voting weight through the linear fusion of visual image similarity and label semantic similarity. Simultaneously, the paper uses the target image labels and their visual neighbors to construct a label graph model and uses the weighted voting results to perform a random walk on the label graph to complete the label ranking task. The experimental results verify the effectiveness of the proposed method.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126736417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
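The final ranking step, a random walk on the label graph seeded by the weighted-voting scores, can be sketched as a random walk with restart; the restart probability, iteration count, and toy graph below are assumptions for illustration.

```python
import numpy as np

def rank_labels(transition: np.ndarray, votes: np.ndarray,
                alpha: float = 0.85, iters: int = 100) -> np.ndarray:
    """Random walk with restart on a label graph, seeded by voting scores."""
    seed = votes / votes.sum()                   # restart distribution from the votes
    r = seed.copy()
    for _ in range(iters):
        r = alpha * transition.T @ r + (1 - alpha) * seed
    return r                                     # stationary scores used for ranking

# Toy label graph over 3 tags; edge weights ~ label semantic similarity
w = np.array([[0.0, 0.7, 0.3],
              [0.7, 0.0, 0.5],
              [0.3, 0.5, 0.0]])
transition = w / w.sum(axis=1, keepdims=True)    # row-normalize into a transition matrix
votes = np.array([0.5, 0.3, 0.2])                # initial weighted-voting scores
print(rank_labels(transition, votes))            # final per-tag ranking scores
```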
A Method of Sound Event Localization and Detection Based on Three-Dimension Convolution
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9886722
Pengcheng Mei, Jibin Yang, Qiang Zhang, Xian Huang
{"title":"A Method of Sound Event Localization and Detection Based on Three-Dimension Convolution","authors":"Pengcheng Mei, Jibin Yang, Qiang Zhang, Xian Huang","doi":"10.1109/ICIVC55077.2022.9886722","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9886722","url":null,"abstract":"Deep Learning methods represented by convolutional neural networks can jointly realize Sound Event Detection (SED) and Sound Source Location (SSL). However, due to the noise and reverberation in real scenes, the accuracy of direction estimation is still dissatisfactory. Since three-dimensional convolution can carry out convolution calculation in time, frequency and channel domains for multichannel input simultaneously, it can learn more inter-channel and intra-channel features and effectively solve the above problems compared to two-dimensional convolution. Inspired by it, a method based on three-dimension convolution feature extraction called SELD3Dnet is proposed. The amplitude and phase characteristics of input multi-channel audio are calculated, and the deep feature representation is extracted through multiple 3D convolutional structures. Finally, the category and spatial location of sound events are estimated by recurrent neural networks and fully connection layers. Comparative experiments are conducted on TUT2018 datasets, and the results show that the proposed method improves the F1 metric by 13.9% and the frame recall metric by 21.1% on average under various types of real scene data subset ov1, ov2, ov3, which can validate the performance of the proposed method.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125459489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
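The core of the 3D-convolution design can be shown in a few lines: a Conv3d kernel that spans the microphone-channel, time, and frequency axes at once, with amplitude and phase as the two input feature planes. The layer widths, pooling, and array geometry below are assumptions, not SELD3Dnet's exact architecture.

```python
import torch
import torch.nn as nn

# A single 3D conv block of the kind SELD3Dnet stacks: the kernel spans the
# microphone-channel, time, and frequency axes simultaneously.
block = nn.Sequential(
    nn.Conv3d(in_channels=2, out_channels=64,     # 2 input planes: amplitude and phase
              kernel_size=(3, 3, 3), padding=1),  # (mic-channel, time, freq) kernel
    nn.BatchNorm3d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool3d(kernel_size=(1, 1, 2)),          # pool along frequency only
)

# (batch, feature-plane, mic channels, time frames, freq bins)
x = torch.randn(1, 2, 4, 128, 64)                 # e.g. 4-mic array, 128 frames, 64 bins
print(block(x).shape)                             # torch.Size([1, 64, 4, 128, 32])
```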
Stroke Based Shadow Generation For Line Drawings
Pub Date: 2022-07-26 | DOI: 10.1109/ICIVC55077.2022.9887289
Huanhuan Xue, Chunmeng Kang
{"title":"Stroke Based Shadow Generation For Line Drawings","authors":"Huanhuan Xue, Chunmeng Kang","doi":"10.1109/ICIVC55077.2022.9887289","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9887289","url":null,"abstract":"We present a method to generate stylized shadows for line drawings. To begin with, we disturb the RGB values of the image in a small range, and propose a new calculation method to estimate the stroke density of the disturbed image. Then the light effect map is generated based on the wave function which is combined with the original image to produce a shadow effect for the original image. We use image enhancement techniques to improve the quality of the shadows and enhance the subjective visual effect. Our algorithm adapts to the image structure, and simplifies the user’s workflow, reduces the user’s workload, and saves time when drawing image shadows. Abundant experiments prove that our method solves the difficulty of adding light and shadow to line drawings using stroke density.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131895546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
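Since the paper's wave function and stroke-density weighting are not specified here, the following sketch uses an assumed directional sinusoid as the light-effect map and simply darkens a grayscale drawing where the map is low; it illustrates the shape of the pipeline rather than the paper's method.

```python
import numpy as np

def light_effect_map(h: int, w: int, freq: float = 0.05, angle: float = 0.6):
    """Sinusoidal 'wave function' light map: bright/dark bands across the canvas."""
    ys, xs = np.mgrid[0:h, 0:w]
    phase = freq * (np.cos(angle) * xs + np.sin(angle) * ys)  # directional wave
    return 0.5 + 0.5 * np.sin(2 * np.pi * phase)              # values in [0, 1]

def apply_shadow(gray: np.ndarray, strength: float = 0.5) -> np.ndarray:
    """Darken a grayscale line drawing where the light map is low."""
    light = light_effect_map(*gray.shape)
    return np.clip(gray * (1 - strength + strength * light), 0.0, 1.0)

drawing = np.ones((128, 128))            # white canvas; strokes would be darker values
shaded = apply_shadow(drawing)
print(shaded.min(), shaded.max())        # banded shading in [0.5, 1.0]
```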