2022 7th International Conference on Image, Vision and Computing (ICIVC), 26 July 2022

DenseATT-Net: Densely-Connected Neural Network with Intensive Attention Modules for 3D ABUS Mass Segmentation
Hengyu Zhang, Jingxuan Xu, Mengyu Wang, Yanfeng Li
DOI: https://doi.org/10.1109/ICIVC55077.2022.9886080

Abstract: Accurate segmentation of breast masses in 3D automated breast ultrasound (ABUS) images is important for breast cancer analysis. However, it is hard to obtain enough labeled ABUS images for training segmentation networks, which can lead to overfitting in deep-learning-based methods. To address this, the lightweight segmentation network D2U-Net is selected as the baseline. ABUS images also have a low signal-to-noise ratio and severe artifacts, which make mass boundaries unclear. To address this second problem, several kinds of attention modules are inserted into the segmentation network: spatial attention, channel attention, the convolutional block attention module (CBAM), and the squeeze-and-excitation (SE) block. The whole segmentation network is termed DenseATT-Net. An ABUS dataset with 170 volumes is used to verify segmentation performance. Experimental results show that the proposed method outperforms other segmentation models on 3D ABUS images.
Non-cooperative Space Target High-Speed Tracking Measuring Method Based on FPGA
Kailiang Han, Haodong Pei, Zhentao Huang, Tao Huang, Shangshi Qin
DOI: https://doi.org/10.1109/ICIVC55077.2022.9887187

Abstract: Based on an analysis of the visible-image features of space targets, a high-speed tracking and measurement method is proposed for non-cooperative space targets against a starry background. Interference from background stars in the target detection results is removed using algorithms such as cluster analysis of target-oriented graphical features, and tracking is completed with a shape-center extraction algorithm. The algorithm is deployed on an FPGA-based embedded space system. Processing efficiency is improved by optimizing the clustering algorithm and setting a region of interest, and the parallel processing capability of the FPGA is used to process image data while it is being read. In tests, the image processing speed reaches 50 Hz for targets with imaging scales between 3x3 and 500x500 pixels. The design has been applied in a practical system with good results.
Human Action Recognition Based on Three-Stream Network with Frame Sequence Features
Ruifeng Huang, Chong Chen, Rui Cheng, Y. Zhang, Jiabing Zhu
DOI: https://doi.org/10.1109/ICIVC55077.2022.9887162

Abstract: Two-stream models are widely used in human action recognition (HAR). However, traditional two-stream networks disregard the inter-frame sequence characteristics of video, which reduces robustness when local sequence information and long-term motion information interact. In light of this, a novel three-stream neural network is proposed that combines the long-term and short-term characteristics of a frame sequence with spatio-temporal information. First, optical-flow frames and RGB frames are extracted from the video to obtain motion and spatial information; these are fed into the corresponding temporal and spatial networks, and the spatial information is also fed into a sequence-feature network; the three networks are then pretrained. After training, features are extracted, weighted and combined by a parallel fusion algorithm, and action categories are classified with a multi-layer perceptron, as sketched below. Experimental results on the UCF11, UCF50, and HMDB51 datasets demonstrate that the model effectively integrates the spatial-temporal and frame-sequence information of human actions, yielding a significant improvement in recognition accuracy. Its classification accuracy on the three datasets is 99.17%, 97.40%, and 96.88%, respectively, notably improving on the generalization capability of conventional two-stream and three-stream models.
{"title":"Similarity Measurement Human Actions with GNN","authors":"Xiuxiu Li, Pu Zhang, Chaoxian Wang, Shengjun Wu","doi":"10.1109/ICIVC55077.2022.9887189","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9887189","url":null,"abstract":"Measuring the similarity of human actions represented with human skeletons from motion capture plays an important role in classification, retrieval and analysis of actions. In this paper, a similarity measurement method of human action based on graph neural network is proposed. In this method, due to the introduction of graph convolution neural network, the dependence between adjacent joints in human skeleton can be obtained, which makes the expression of human action in a frame more accurate. In the further action similarity measurement, LSTM with self- attention is used to extract temporal feature of the human action sequence, and finally MMD-NCA is used to measure the similarity of action sequences. Experiments on public dataset verify the effectiveness of this method in action recognition and action similarity measurement.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"48 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120908042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Encoder-Decoder Network with Residual and Attention Blocks for Full-Face 3D Gaze Estimation","authors":"Xinyuan Song, Shaoxiang Guo, Zhenfu Yu, Junyu Dong","doi":"10.1109/ICIVC55077.2022.9886734","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9886734","url":null,"abstract":"This paper proposes a novel end-to-end network to improve the accuracy of gaze estimation task with full-face image as input. We first explored the possibility of using the encoder-decoder network to reconstruct the input face image, then we used U-Net with residual blocks to retain eyes features hidden in high resolution feature map layers, which are often lost during down-sampling and convolution layers. Finally, we applied spatial and channel-wise attention blocks to our model to better consider the relations among different regions globally and enhance the contribution of valuable gaze-related regions. We conducted experiments on the ETH-XGaze dataset. The results turned out that our proposed model is very competitive compared with existing state-of-the-art methods for person-independent gaze estimation.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"15 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120928903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improved YOLOv5 with Transformer for Large Scene Military Vehicle Detection on SAR Image
Yi Sun, Wenna Wang, Qianyu Zhang, Han Ni, Xiuwei Zhang
DOI: https://doi.org/10.1109/ICIVC55077.2022.9887095

Abstract: With the development of SAR technology, large-scene object detection in SAR images has attracted increasing attention. Existing large-scene object detection is mainly based on CNNs, which limits access to global context information. In addition, due to the high acquisition cost of SAR images, no public dataset exists for military vehicle detection. To address these problems, we build the neck block of YOLOv5 from Transformer modules. This design captures global context and also performs better on small-object detection. Furthermore, to enable detection of military ground vehicles in large scenes, we construct a dataset based on the MSTAR dataset, named LSGVOD. Extensive experiments on LSGVOD show that the proposed method greatly improves detection accuracy, achieving the best accuracy among compared methods with 93.3% mAP.
{"title":"Robust Facial Expression Recognition Based on Dual Branch Multi-feature Learning","authors":"Xuewen Liu, Zhe Guo, Boya Yuan, Haojie Guo","doi":"10.1109/ICIVC55077.2022.9886565","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9886565","url":null,"abstract":"Facial expression recognition (FER) is a key factor in human behavior analysis. Most algorithms are difficult to distinguish the subtle differences of local facial features, such as facial wrinkles and mouth corners. To solve the above problems, We propose the Dual Branch Multi-feature Learning Network (DBML-Net) to explore the latent incentive. It contains two branchs. One branch works to extract apparent features from the original images, the other uses two texture features (CS-LOP and ALDP) to enhance the detailed information. A Densely Connected Dynamic Selective Kernel Network (Dense-SK) is constructed as the feature extraction section of branch one. The extensive experimental results show that the DBML-Net achieves state-of-the-art performance on three widely used FER datasets: CK+, Oulu-CASIA and JAFFE, which demonstrate the effectiveness of our method.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122746477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on Fast Extraction of Information System from Online Social Network Images Based on Big Data Algorithm
R. Qi, Yuanlong Chen, Xiaojiang Sun, Si Qingaowa, Shenghui Chen
DOI: https://doi.org/10.1109/ICIVC55077.2022.9886207

Abstract: This paper proposes a big-data algorithm for ranking social network image tags that combines SIFT features, convolutional neural network features, and a visual bag-of-words model to obtain a target image's visual-neighbor set from the image training set. The labels of all visual neighbors serve as the initial labels of the target image for weighted voting, with voting weights computed by a linear fusion of visual image similarity and label semantic similarity. At the same time, the target image's labels and those of its visual neighbors are used to build a label graph model, and the weighted voting results seed a random walk on the label graph to complete the label-ranking task. Experimental results verify the effectiveness of the proposed method.
{"title":"A Method of Sound Event Localization and Detection Based on Three-Dimension Convolution","authors":"Pengcheng Mei, Jibin Yang, Qiang Zhang, Xian Huang","doi":"10.1109/ICIVC55077.2022.9886722","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9886722","url":null,"abstract":"Deep Learning methods represented by convolutional neural networks can jointly realize Sound Event Detection (SED) and Sound Source Location (SSL). However, due to the noise and reverberation in real scenes, the accuracy of direction estimation is still dissatisfactory. Since three-dimensional convolution can carry out convolution calculation in time, frequency and channel domains for multichannel input simultaneously, it can learn more inter-channel and intra-channel features and effectively solve the above problems compared to two-dimensional convolution. Inspired by it, a method based on three-dimension convolution feature extraction called SELD3Dnet is proposed. The amplitude and phase characteristics of input multi-channel audio are calculated, and the deep feature representation is extracted through multiple 3D convolutional structures. Finally, the category and spatial location of sound events are estimated by recurrent neural networks and fully connection layers. Comparative experiments are conducted on TUT2018 datasets, and the results show that the proposed method improves the F1 metric by 13.9% and the frame recall metric by 21.1% on average under various types of real scene data subset ov1, ov2, ov3, which can validate the performance of the proposed method.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125459489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stroke Based Shadow Generation For Line Drawings","authors":"Huanhuan Xue, Chunmeng Kang","doi":"10.1109/ICIVC55077.2022.9887289","DOIUrl":"https://doi.org/10.1109/ICIVC55077.2022.9887289","url":null,"abstract":"We present a method to generate stylized shadows for line drawings. To begin with, we disturb the RGB values of the image in a small range, and propose a new calculation method to estimate the stroke density of the disturbed image. Then the light effect map is generated based on the wave function which is combined with the original image to produce a shadow effect for the original image. We use image enhancement techniques to improve the quality of the shadows and enhance the subjective visual effect. Our algorithm adapts to the image structure, and simplifies the user’s workflow, reduces the user’s workload, and saves time when drawing image shadows. Abundant experiments prove that our method solves the difficulty of adding light and shadow to line drawings using stroke density.","PeriodicalId":227073,"journal":{"name":"2022 7th International Conference on Image, Vision and Computing (ICIVC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131895546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}