Proceedings of the 5th International Conference on Control and Computer Vision: Latest Publications

Visual Comfort Classification for Stereoscopic Videos Based on Two-Stream Recurrent Neural Network with Multi-level Attention
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561628
Weize Gan, Danhong Peng, Yuzhen Niu
{"title":"Visual Comfort Classification for Stereoscopic Videos Based on Two-Stream Recurrent Neural Network with Multi-level Attention","authors":"Weize Gan, Danhong Peng, Yuzhen Niu","doi":"10.1145/3561613.3561628","DOIUrl":"https://doi.org/10.1145/3561613.3561628","url":null,"abstract":"Due to the differences in visual systems between children and adults, a professional stereoscopic 3D video may not be comfortable for children. In this paper, we aim to answer whether a stereoscopic video is comfortable for children to watch by solving the visual comfort classification for stereoscopic videos. In particular, we propose a two-stream recurrent neural network (RNN) with multi-level attention for the visual comfort classification for stereoscopic videos. Firstly, we propose a two-stream RNN to extract and fuse spatial and temporal features from video frames and disparity maps. Furthermore, we propose using multi-level attention to effectively enhance the features in frame level, shot level, and finally video level. In addition, to our best knowledge, we establish the first high-definition stereoscopic 3D video dataset for performance evaluation. Experimental results show that our proposed model can effectively classify professional stereoscopic videos into visually comfortable for children or adults only.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"354 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132583095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
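A minimal sketch of the kind of two-stream recurrent architecture with frame-level attention the abstract describes, assuming per-frame CNN features as input; the feature extractors, dimensions, and the shot- and video-level attention stages are placeholders and may differ from the paper's design.

```python
# Sketch (assumed architecture): two GRU streams over per-frame features from
# RGB frames and disparity maps, fused and pooled by frame-level attention.
import torch
import torch.nn as nn

class TwoStreamAttentionRNN(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, num_classes=2):
        super().__init__()
        self.rgb_rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.disp_rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.attn = nn.Linear(2 * hidden, 1)        # frame-level attention scores
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, rgb_feats, disp_feats):
        # rgb_feats, disp_feats: (batch, time, feat_dim) per-frame CNN features
        h_rgb, _ = self.rgb_rnn(rgb_feats)
        h_disp, _ = self.disp_rnn(disp_feats)
        h = torch.cat([h_rgb, h_disp], dim=-1)       # fuse the two streams
        w = torch.softmax(self.attn(h), dim=1)       # attention over frames
        video_feat = (w * h).sum(dim=1)              # attention-weighted pooling
        return self.classifier(video_feat)

model = TwoStreamAttentionRNN()
logits = model(torch.randn(4, 30, 512), torch.randn(4, 30, 512))  # (4, 2)
```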
Feature Fusion: Graph Attention Network and CNN Combing for Hyperspectral Image Classification
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561640
Qikun Pan, Xiaoxi Xu, Qi Chang, Chundi Pan, Guo Cao
{"title":"Feature Fusion: Graph Attention Network and CNN Combing for Hyperspectral Image Classification","authors":"Qikun Pan, Xiaoxi Xu, Qi Chang, Chundi Pan, Guo Cao","doi":"10.1145/3561613.3561640","DOIUrl":"https://doi.org/10.1145/3561613.3561640","url":null,"abstract":"Graph convolutional networks (GCNs) have attracted increasing attention in hyperspectral image classification. However, most of the available GCN-based HSI classification methods treat superpixels as graph nodes, ignoring pixel-level spectral spatial features. In this paper, we propose a novel Feature Fusion Network (FFGCN), which is composed of two different convolutional networks, namely Graph Attention Network (GAT) and Convolutional Neural Network (CNN). Among them, superpixel-based GAT can deal with the problem of labeled deficiency and extract spatial features from HSI. Attention-based multi-scale CNN can extract multi-scale pixel local features for HSI classification. Finally, the features of the two neural network models are fused and used for classification. Rigorous experiments on two real HSI datasets show that FFGCN achieves better experimental results and is competitive with other state-of-the-art methods.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"24 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114030382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
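A minimal sketch of the fusion step the abstract outlines, assuming the GAT yields one feature vector per superpixel and the CNN one per pixel; scattering superpixel features back to their pixels before concatenation is one plausible reading, and all names and shapes here are hypothetical.

```python
# Sketch (assumed design): fuse superpixel-level GAT features with pixel-level
# CNN features by broadcasting superpixel features to pixels and concatenating.
import torch
import torch.nn as nn

class FeatureFusionHead(nn.Module):
    def __init__(self, gat_dim=64, cnn_dim=64, num_classes=16):
        super().__init__()
        self.classifier = nn.Linear(gat_dim + cnn_dim, num_classes)

    def forward(self, gat_feats, assignment, cnn_feats):
        # gat_feats:  (num_superpixels, gat_dim) GAT output per superpixel
        # assignment: (num_pixels,) superpixel index of each pixel
        # cnn_feats:  (num_pixels, cnn_dim) CNN output per pixel
        pixel_gat = gat_feats[assignment]          # scatter back to pixel grid
        fused = torch.cat([pixel_gat, cnn_feats], dim=-1)
        return self.classifier(fused)

head = FeatureFusionHead()
logits = head(torch.randn(100, 64),
              torch.randint(0, 100, (5000,)),
              torch.randn(5000, 64))              # (5000, 16)
```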
Real-Time Robust Single Sperm Tracking via Adaptive Particle Filtering
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561638
Fengling Meng, Yinran Chen, Xióngbiao Luó
{"title":"Real-Time Robust Single Sperm Tracking via Adaptive Particle Filtering","authors":"Fengling Meng, Yinran Chen, Xióngbiao Luó","doi":"10.1145/3561613.3561638","DOIUrl":"https://doi.org/10.1145/3561613.3561638","url":null,"abstract":"Assisted reproductive technology is commonly used to treat infertility. Motility-based selection of high-quality sperms is the key to improve the successful rate of artificial assisted reproduction. Visually tracking the sperms on optical microscopic video frames is essential to evaluate their motility before the selection. Unfortunately, current methods easily fail to precisely track the sperms in real time. This work is to accurately and robustly detect and track single sperm based on microscopic video frames. We propose a modified background subtraction method to detect multiple sperms in successive frames. We also introduce an adaptive particle filtering method to accurately and robustly track the trajectory of a single sperm in real time. Specifically, this method models the sperm movement by comparing its histogram information at different positions on microscopic images and uses adaptive particle filtering to approximate the optimal state of the sperm. The experimental results demonstrate that our method can achieve much better tracking accuracy than other visual tracking methods, providing more reliable sperm motility analysis. In particular, our method can successfully re-track the same sperm when it appears again on the microscopic focal plane after disappearing in a few frames, while the other compared tracking methods usually fail to re-track the same sperm after its back.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121291802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
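A minimal sketch of a histogram-likelihood particle filter step in the spirit of the abstract: particles are propagated by a motion model, weighted by histogram similarity at their positions, and resampled. The random-walk motion model, patch size, Bhattacharyya similarity, and noise scales are assumptions, not the paper's exact formulation, and the adaptive mechanism is omitted.

```python
import numpy as np

def patch_hist(frame, x, y, half=8, bins=16):
    patch = frame[max(0, y - half):y + half, max(0, x - half):x + half]
    if patch.size == 0:                       # particle drifted off-frame
        return np.full(bins, 1.0 / bins)
    h, _ = np.histogram(patch, bins=bins, range=(0, 256), density=True)
    return h

def particle_filter_step(frame, particles, weights, ref_hist, motion_std=3.0):
    # Predict: random-walk motion model.
    particles = particles + np.random.normal(0, motion_std, particles.shape)
    # Update: weight by Bhattacharyya similarity of local histograms.
    for i, (x, y) in enumerate(particles.astype(int)):
        h = patch_hist(frame, x, y)
        weights[i] = np.sum(np.sqrt(h * ref_hist)) + 1e-12
    weights /= weights.sum()
    # Resample (systematic), then estimate the state as the particle mean.
    idx = np.searchsorted(np.cumsum(weights),
                          (np.arange(len(weights)) + np.random.rand()) / len(weights))
    idx = np.minimum(idx, len(weights) - 1)
    particles = particles[idx]
    weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights, particles.mean(axis=0)

frame = (np.random.rand(240, 320) * 255).astype(np.uint8)
particles = np.tile([160.0, 120.0], (200, 1)) + np.random.randn(200, 2)
weights = np.full(200, 1.0 / 200)
ref = patch_hist(frame, 160, 120)
particles, weights, estimate = particle_filter_step(frame, particles, weights, ref)
```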
“Presence” and “Empathy” — Design and Implementation Emotional Interactive Storytelling for Virtual Character
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561632
Manyu Zhang
{"title":"“Presence” and “Empathy” — Design and Implementation Emotional Interactive Storytelling for Virtual Character","authors":"Manyu Zhang","doi":"10.1145/3561613.3561632","DOIUrl":"https://doi.org/10.1145/3561613.3561632","url":null,"abstract":"One of the key motivators for participating in Virtual Reality (VR) is the opportunity to and appeal of becoming immersed in a virtual environment. One avenue that is anticipated to have significant expansion is storytelling through VR, as it offers novel and absorbing experiences. To develop a design interactive storytelling program using VR-based coding, examples of VR application and coding storytelling were analyzed. Base on this analysis, we developed one design interactive storytelling featuring a virtual environment that supports the facilitation of such experiences. In this paper, we introduce expands the interactive storytelling structure, both in general and for VR. The current interactive storytelling systems are extended via emotional modeling and tracking. The components being proposed are to supplement the story segments with information about the response anticipated from users, a modeled emotional path for the individual emotional categories linked to the story, and an internal system to track emotions, in a bid to predict the users’ present emotional condition. We also show the results of the implementation with the 43 students (age 18-28) that demonstrate the emotional expression for the use of interactive storytelling. The results showed that virtual interactive storytelling, the usability of the system and the impact of plot development on inference and story understanding.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130212809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
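A minimal sketch of one way the described components might be represented: story segments annotated with an anticipated response and an emotional path, plus a simple tracker that smooths observed emotion into a running state estimate. All names, fields, and the smoothing rule are hypothetical; the paper does not specify an implementation.

```python
# Sketch (assumed data model) for emotion-annotated story segments and an
# internal emotion tracker, as outlined in the abstract.
from dataclasses import dataclass, field

@dataclass
class StorySegment:
    text: str
    anticipated_response: str              # e.g. "empathy", "tension"
    emotional_path: dict[str, float]       # target intensity per emotion category

@dataclass
class EmotionTracker:
    state: dict[str, float] = field(default_factory=dict)
    alpha: float = 0.3                     # smoothing factor (assumed)

    def update(self, observed: dict[str, float]) -> dict[str, float]:
        # Exponential smoothing toward the observed emotional signal.
        for k, v in observed.items():
            self.state[k] = (1 - self.alpha) * self.state.get(k, 0.0) + self.alpha * v
        return self.state

seg = StorySegment("The door creaks open...", "tension", {"fear": 0.7, "joy": 0.1})
tracker = EmotionTracker()
print(tracker.update({"fear": 0.6}))       # {'fear': 0.18}
```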
Hybrid-Spatial Transformer for Image Captioning
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561617
Jincheng Zheng, Chi-Man Pun
{"title":"Hybrid-Spatial Transformer for Image Captioning","authors":"Jincheng Zheng, Chi-Man Pun","doi":"10.1145/3561613.3561617","DOIUrl":"https://doi.org/10.1145/3561613.3561617","url":null,"abstract":"Recent years, the transformer-based model has achieved great success in many tasks such as machine translation. This encoder-decoder architecture is proved to be useful for image captioning tasks as well. We propose a novel Hybrid-Spatial Transformer model for image captioning. In this work, we combine the Global information and Local information of image as input of encoder which extracted by VGG16 and Faster R-CNN respectively. To further improve the performance of model, we add spatial information to attention layer by incorporating geometry features to attention weight. What’s more, queries Q, keys K, values V are a bit different from standard transformer, which is reflected in theses aspects. The positional encoding or embedding is not added to values V both encoder and decoder, the positional embedding is added to keys K on cross-attention. The experimental results illustrate that our model can achieve state-of-the art performance on CIDEr-D, METEROR and BLEU-1 on MS-COCO dataset.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122456074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
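A minimal sketch of attention with an additive geometry bias, one common way to "incorporate geometry features into the attention weights"; the geometry encoding, bias head, and shapes are illustrative assumptions rather than the paper's exact design.

```python
# Sketch: scaled dot-product attention plus a bias computed from pairwise
# box-geometry features (e.g. log offsets and scales between regions).
import torch
import torch.nn as nn

class GeometryBiasedAttention(nn.Module):
    def __init__(self, dim=512, geo_dim=4):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
        self.geo_bias = nn.Linear(geo_dim, 1)  # pairwise geometry -> scalar bias

    def forward(self, x, geo):
        # x:   (batch, n, dim) region features
        # geo: (batch, n, n, geo_dim) pairwise geometry features
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
        scores = scores + self.geo_bias(geo).squeeze(-1)  # add geometry bias
        return torch.softmax(scores, dim=-1) @ v

attn = GeometryBiasedAttention()
out = attn(torch.randn(2, 36, 512), torch.randn(2, 36, 36, 4))  # (2, 36, 512)
```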
Frequency Domain Spline Prioritization Optimization Adaptive Filters
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561645
Wenyan Guo, Yongfeng Zhi, Zhe Zhang, Honggang Gao
{"title":"Frequency Domain Spline Prioritization Optimization Adaptive Filters","authors":"Wenyan Guo, Yongfeng Zhi, Zhe Zhang, Honggang Gao","doi":"10.1145/3561613.3561645","DOIUrl":"https://doi.org/10.1145/3561613.3561645","url":null,"abstract":"The spline prioritization optimization adaptive filter (SPOAF) is a nonlinear filtering algorithm with a relatively simple architecture. It is composed of the FIR filter cascaded a nonlinear interpolation module. When the length of the FIR filter is long, the computational complexity will increase exponentially. To solve this problem, this paper proposes a frequency domain spline prioritization optimization adaptive filter (FDSPOAF). More specifically, the FIR filter is implemented in the frequency domain, using the fast Fourier transform and its inverse transform, which converts convolution in the time domain into multiplication in the frequency domain. This paper describes the detailed steps of the FDSPOAF method and analyzes the computational complexity. Finally, it is verified by numerical experiments that the algorithm can reduce the operation time. Compared with the traditional SPOAF algorithm, the proposed FDSPOAF algorithm can effectively reduce the operation time of the algorithm with comparable convergence performance.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132440114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
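The core trick, convolution by frequency-domain multiplication, can be illustrated with a standard overlap-save FIR block filter; the sketch below covers only this FFT step, not SPOAF's spline interpolation or adaptation, and the block size is an arbitrary choice.

```python
# Sketch: overlap-save FIR filtering, replacing time-domain convolution with
# frequency-domain multiplication via the FFT.
import numpy as np

def overlap_save_fir(x, w, block=256):
    M = len(w)
    N = block + M - 1                          # FFT size per block
    W = np.fft.rfft(w, N)                      # filter response, computed once
    y = np.zeros(len(x))
    xp = np.concatenate([np.zeros(M - 1), x])  # history for the first block
    for start in range(0, len(x), block):
        seg = xp[start:start + N]
        if len(seg) < N:
            seg = np.pad(seg, (0, N - len(seg)))
        yb = np.fft.irfft(np.fft.rfft(seg) * W, N)
        out = yb[M - 1:M - 1 + block]          # discard the aliased prefix
        y[start:start + block] = out[:len(y) - start]
    return y

x = np.random.randn(1000)
w = np.random.randn(32)
assert np.allclose(overlap_save_fir(x, w), np.convolve(x, w)[:len(x)], atol=1e-8)
```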
Masked Face Recognition Using MobileNetV2
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561650
Ming Liu, Wei Yan
{"title":"Masked Face Recognition Using MobileNetV2","authors":"Ming Liu, Wei Yan","doi":"10.1145/3561613.3561650","DOIUrl":"https://doi.org/10.1145/3561613.3561650","url":null,"abstract":"Masked face recognition has made great progress in the field of computer vision since the popularity of COVID-19 epidemic in 2020. In countries with severe outbreaks, people are required to wear masks in public. The current face recognition methods, which take use of the whole face as input data, are quite well established. However, while people are use of face masks, it will reduce the accuracy of face recognition. Therefore, we propose a mask wearing recognition method based on MobileNetV2 and solve the problem that many of models cannot be applied to portable devices or mobile terminals. The results indicate that this method has 98.30% accuracy in identifying the masked face. Simultaneously, a higher accuracy is obtained compared to VGG16. This approach has proven to be working well for the practical needs.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129325737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
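A minimal sketch of a mobile-friendly setup in this spirit: an ImageNet-pretrained MobileNetV2 backbone with a small classification head. The class count, input size, and training settings are illustrative assumptions, not the paper's configuration.

```python
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False                              # freeze the backbone first

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 identities (assumed)
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```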
Pulmonary Nodule Detection Based on RPN with Squeeze-and-Excitation Block
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561627
Xiaoxi Lu, Xingyue Wang, Jiansheng Fang, Na Zeng, Yao Xiang, Jingfeng Zhang, Jianjun Zheng, Jiang Liu
{"title":"Pulmonary Nodule Detection Based on RPN with Squeeze‐and‐Excitation Block","authors":"Xiaoxi Lu, Xingyue Wang, Jiansheng Fang, Na Zeng, Yao Xiang, Jingfeng Zhang, Jianjun Zheng, Jiang Liu","doi":"10.1145/3561613.3561627","DOIUrl":"https://doi.org/10.1145/3561613.3561627","url":null,"abstract":"Early detection of lung cancer is a crucial step to improve the chances of survival. To detect the pulmonary nodules, various methods are proposed including one-stage object detection methods (e.g., YOLO, SSD) and two-stage detection methods(e.g., Faster RCNN). Two-stage methods are more accurate than one-stage, thus more likely used in the detection of a small object. Faster RCNN as a two-stage method, ensuring more efficient and accurate region proposal generation, is consistent with our task’s objective, that is, detecting small 3-D nodules from large CT image volume. Therefore, in our work, we used 3-D region proposal network (RPN) proposed in Faster RCNN to detect nodules. However, different from natural images with clear boundaries and textures, pulmonary nodules have different types and locations, which are hard to recognize. Thus with the thought that if the network can learn more features of the nodules, the performance would be better, we also applied the \"Squeeze-and-Excitation\" blocks to the 3-D RPN, which we term it as SE-Res RPN. The experimental results show that the sensitivity of SE-Res RPN in 10-fold cross-validation of LUNA 16 is 93.7 , which achieves great performance without a false positive reduction stage.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114886270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
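A minimal sketch of a 3-D Squeeze-and-Excitation block, the channel-recalibration unit the abstract adds to the RPN; the reduction ratio is the standard SE default, and its placement within the backbone is not reproduced here.

```python
import torch
import torch.nn as nn

class SEBlock3D(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # x: (batch, channels, D, H, W)
        b, c = x.shape[:2]
        s = x.mean(dim=(2, 3, 4))            # squeeze: global average pooling
        w = self.fc(s).view(b, c, 1, 1, 1)   # excitation: per-channel weights
        return x * w                         # recalibrate the feature maps

se = SEBlock3D(64)
out = se(torch.randn(2, 64, 16, 32, 32))     # same shape as the input
```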
Multi-stage Citrus Detection based on Improved Yolov4
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561623
Bingliang Yi, Bin Kong, C. Xu
{"title":"Multi-stage Citrus Detection based on Improved Yolov4","authors":"Bingliang Yi, Bin Kong, C. Xu","doi":"10.1145/3561613.3561623","DOIUrl":"https://doi.org/10.1145/3561613.3561623","url":null,"abstract":"At present, the research of Citrus recognition is basically aimed at the detection of Citrus in mature stage. This paper proposes a citrus detection algorithm based on improved yolov4, which can detect citrus in each growth stage. Based on yolov4, Introducing CBAM attention mechanism to improve the feature extraction ability of backbone networks; Increase the 22nd layer output of feature extraction network to improve the small target detection rate; A short connection feature fusion method is designed to increase the utilization of shallow feature information; Add a detection head with a scale of 152 * 152 for small-scale targets. It is proved by experiments on the self-built citrus data set, the improved CBAM-F-YOLOv4 can effectively detect citrus in each stage, and the mean Average Precision (mAP) is 6.2 percentage points higher than the original algorithm, reaching 87.3%. The detection results show that the improved algorithm greatly improves the detection ability of occlusion、 overlap and small-scale citrus.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128450691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
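A minimal sketch of a CBAM (Convolutional Block Attention Module) unit in its standard channel-then-spatial form, the attention mechanism the abstract adds to the backbone; the reduction ratio, kernel size, and insertion points are assumptions.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel, padding=kernel // 2)

    def forward(self, x):
        # Channel attention: shared MLP over avg- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: conv over channel-wise avg and max maps.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

out = CBAM(128)(torch.randn(1, 128, 52, 52))  # same shape as the input
```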
Ski Fall Detection from Digital Images Using Deep Learning
Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561625
Yulin Zhu, Wei Yan
{"title":"Ski Fall Detection from Digital Images Using Deep Learning","authors":"Yulin Zhu, Wei Yan","doi":"10.1145/3561613.3561625","DOIUrl":"https://doi.org/10.1145/3561613.3561625","url":null,"abstract":"In this paper, we explore how to take advantage of computer vision to assist ski resorts and monitor the safety of skiers on the tracks. In order to quickly detect any falls or injures, and provide first aid for injured people, we make use of archived ski videos, which are employed to explore the possibility of skiers fall detection. Throughout combinations of visual object detection with human pose detection by using deep learning methods. Our ultimate goal of this project is to provide a way for ski safety monitoring which has potential applications for physical training. Our contribution in this paper is to propose a fall detection method suitable for skiers based on visual object detection, we have obtained 0.94 mAP accuracy in preliminary tests.","PeriodicalId":348024,"journal":{"name":"Proceedings of the 5th International Conference on Control and Computer Vision","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130604182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
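As one illustration of how pose output might feed a fall decision, the sketch below flags a skier as fallen when the torso axis is closer to horizontal than vertical. This heuristic, its threshold, and the keypoint layout are entirely hypothetical; the paper's method is built on deep-learning detection and pose models rather than this rule.

```python
import math

def torso_angle_deg(shoulder_xy, hip_xy):
    # Angle of the shoulder-hip axis from vertical; image y grows downward.
    dx = hip_xy[0] - shoulder_xy[0]
    dy = hip_xy[1] - shoulder_xy[1]
    return math.degrees(math.atan2(abs(dx), abs(dy)))

def is_fallen(shoulder_xy, hip_xy, threshold_deg=60.0):
    # Near 0 deg: upright torso; near 90 deg: torso parallel to the ground.
    return torso_angle_deg(shoulder_xy, hip_xy) > threshold_deg

print(is_fallen((100, 50), (105, 150)))    # upright skier -> False
print(is_fallen((100, 100), (200, 110)))   # horizontal torso -> True
```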