Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition: Latest Publications

An Augmented Reality Tracking Registration Method Based on Deep Learning
Xingya Yan, Guangrui Bai, Chaobao Tang
DOI: 10.1145/3573942.3574034 | Published: 2022-09-23
Abstract: Augmented reality is a three-dimensional visualization technology that supports human-computer interaction: virtual information is placed in a designated area of the real world to enhance real-world information. Building on the existing augmented reality pipeline, this paper proposes a deep-learning-based augmented reality method that targets the inaccurate positioning and model drift of markerless methods under complex backgrounds, lighting changes, and partial occlusion. The proposed method uses a lightweight SSD model for target detection, the SURF algorithm to extract feature points, and the FLANN algorithm for feature matching. Experimental results show that this method effectively solves the problems of inaccurate positioning and model drift under such conditions while maintaining the operational efficiency of the augmented reality system.
Citations: 0
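
A minimal sketch of the registration stage this pipeline implies, assuming the SSD detector has already returned a region of interest: SURF keypoints are matched between a reference template and the detected region with FLANN, and a homography is estimated to register the virtual content. SURF requires an opencv-contrib build with non-free modules enabled; file names and thresholds here are placeholders.

```python
import cv2
import numpy as np

# Placeholder inputs: a reference template of the target and the ROI that
# the SSD detector returned for the current frame (both grayscale).
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
roi = cv2.imread("detected_roi.png", cv2.IMREAD_GRAYSCALE)

# SURF lives in opencv-contrib and needs the non-free build flag.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
kp1, des1 = surf.detectAndCompute(template, None)
kp2, des2 = surf.detectAndCompute(roi, None)

# FLANN with a KD-tree index, the usual choice for float descriptors like SURF.
FLANN_INDEX_KDTREE = 1
flann = cv2.FlannBasedMatcher(dict(algorithm=FLANN_INDEX_KDTREE, trees=5),
                              dict(checks=50))
matches = flann.knnMatch(des1, des2, k=2)

# Lowe's ratio test keeps only distinctive matches.
good = [m for m, n in matches if m.distance < 0.7 * n.distance]

if len(good) >= 4:
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # The homography maps template coordinates into the frame: the
    # registration transform used to anchor the virtual content.
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
```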
Visual Correlation Filter Tracking for UAV Based on Temporal and Spatial Regularization with Boolean Maps
Na Li, Jiale Gao, Y. Liu, Yansheng Zhu, Wenhan Jiang
DOI: 10.1145/3573942.3574036 | Published: 2022-09-23
Abstract: Object tracking is now widely used in sports broadcasting, security surveillance, and human-computer interaction. Tracking on unmanned aerial vehicle (UAV) datasets is challenging due to factors such as illumination change, appearance change, occlusion, and motion blur. To address this, a visual correlation filter tracking algorithm based on temporal and spatial regularization is proposed. It employs Boolean maps to obtain visual attention and fuses features such as color names (CN), histogram of oriented gradients (HOG), and gray features to enhance the visual representation. A new object-occlusion judgment method and model update strategy are put forward to make the tracker more robust. The proposed algorithm is compared with six other trackers in terms of distance precision and success rate on UAV123, and the experimental results show that it achieves more stable and robust tracking performance.
Citations: 0
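
A small sketch of the Boolean-map attention idea the tracker builds on: a feature channel is thresholded at several levels and the resulting binary maps are averaged into a soft attention map. The threshold count and normalization are assumptions.

```python
import numpy as np

def boolean_map_attention(channel: np.ndarray, n_thresholds: int = 8) -> np.ndarray:
    """Average a stack of thresholded Boolean maps into a soft attention map.

    `channel` is a single feature channel (e.g. a gray or CN channel),
    rescaled here to [0, 1]; the threshold count is an assumed parameter.
    """
    rng = channel.max() - channel.min()
    c = (channel - channel.min()) / (rng + 1e-8)
    thresholds = np.linspace(0.0, 1.0, n_thresholds + 2)[1:-1]  # skip 0 and 1
    maps = np.stack([(c > t).astype(np.float32) for t in thresholds])
    return maps.mean(axis=0)  # high where the channel survives many thresholds

# Usage: weight the HOG/CN/gray features by the attention map before the
# correlation filter update.
attention = boolean_map_attention(np.random.rand(64, 64))
```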
Effects of PM2.5 on the Detection Performance of Quantum Interference Radar
Lihao Tian, Min Nie, Guang Yang
DOI: 10.1145/3573942.3574117 | Published: 2022-09-23
Abstract: To study the influence of PM2.5 particles on the detection performance of quantum interference radar, this article analyzes the relationship between PM2.5 particle concentration and the extinction coefficient for different particle sizes, based on the spectral distribution function of PM2.5 particles and Mie scattering theory, and then establishes a model of the influence of PM2.5 particles on the detection distance and maximum detection error probability of quantum interference radar. The simulation results show that as PM2.5 particle concentration increases, the extinction coefficient gradually increases and the energy of the detected photons is attenuated, reducing the transmission distance of the photons. When the emitted photon energy is held constant, the maximum detection error probability of quantum interference radar increases with PM2.5 particle concentration; when PM2.5 particle concentration is held constant, the maximum detection error probability decreases gradually as the emitted photon energy increases. The average number of emitted photons should therefore be adjusted according to PM2.5 pollution levels to reduce the impact of PM2.5 pollution on the detection performance of quantum interference radar.
Citations: 0
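
A toy numerical sketch of the attenuation step in this chain: given an extinction coefficient that grows with PM2.5 concentration, Beer-Lambert attenuation gives the mean photon number surviving the propagation path. The linear concentration-to-extinction mapping and its constant are illustrative assumptions; the paper derives the coefficient from the Mie-theory size distribution instead.

```python
import numpy as np

def mean_photons_received(n_emitted: float, concentration_ugm3: float,
                          path_km: float, k_ext_per_ugm3: float = 1e-3) -> float:
    """Beer-Lambert attenuation of the mean photon number.

    The extinction coefficient is modeled here as linear in PM2.5
    concentration (k_ext_per_ugm3 is an illustrative constant); the paper
    obtains it from the Mie-theory particle size distribution instead.
    """
    beta = k_ext_per_ugm3 * concentration_ugm3  # extinction coefficient, 1/km
    return n_emitted * np.exp(-beta * path_km)

# More particles -> stronger extinction -> fewer photons survive the path.
for c in (35, 75, 150):  # concentrations in ug/m^3
    print(c, mean_photons_received(n_emitted=1000, concentration_ugm3=c, path_km=10))
```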
Single Image Dehazing Via Enhanced CycleGAN
Sheping Zhai, Yuanbiao Liu, Dabao Cheng
DOI: 10.1145/3573942.3574097 | Published: 2022-09-23
Abstract: Due to atmospheric light scattering, images acquired by outdoor imaging devices in haze exhibit low definition, reduced contrast, overexposure, and other visible quality degradation, which makes the relevant computer vision tasks difficult to handle. Image dehazing has therefore become an important research area in computer vision. However, existing dehazing methods generally require paired datasets containing both hazy images and corresponding ground-truth images, and the recovered images are prone to color distortion and detail loss. In this study, an end-to-end image dehazing method based on Cycle-consistent Generative Adversarial Networks (CycleGAN) is proposed. To effectively learn the mapping between hazy and clear images, we refine the transformation module of the generator by weighting optimization, which improves the network's adaptability to scale. To further improve the quality of the generated images, an enhanced perceptual loss and a low-frequency loss combined with image feature attributes are built into the network's overall optimization objective. The experimental results show that our dehazing algorithm effectively recovers texture information while correcting the color distortion of the original CycleGAN, and the recovery effect is clear and more natural, better reducing the influence of haze on imaging quality.
Citations: 0
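
A hedged PyTorch sketch of the two auxiliary losses named above: a perceptual loss over VGG16 features and a low-frequency loss computed on low-passed images. The VGG layer choice, the use of average pooling as the low-pass filter, and the reference pairing (e.g. against the cycle-reconstructed image) are assumptions; the abstract does not give the exact formulations.

```python
import torch
import torch.nn.functional as F
from torchvision import models

class PerceptualLoss(torch.nn.Module):
    """L1 distance between VGG16 feature maps (relu3_3 is an assumed layer)."""
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16]
        self.vgg = vgg.eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)

    def forward(self, generated, reference):
        return F.l1_loss(self.vgg(generated), self.vgg(reference))

def low_frequency_loss(generated, reference, factor: int = 4):
    """Compare heavily low-passed images; average pooling stands in for
    whatever low-pass filter the paper actually uses."""
    return F.l1_loss(F.avg_pool2d(generated, factor),
                     F.avg_pool2d(reference, factor))

# Added to the CycleGAN objective with assumed weights, e.g.:
# total = gan_loss + cycle_loss + 1.0 * perceptual(y_hat, y_ref) \
#         + 0.5 * low_frequency_loss(y_hat, y_ref)
```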
Hyperspectral Anomaly Detection based on Autoencoder using Superpixel Manifold Constraint
Yuquan Gan, Wenqiang Li, Y. Liu, Jinglu He, Ji Zhang
DOI: 10.1145/3573942.3574108 | Published: 2022-09-23
Abstract: In hyperspectral anomaly detection, autoencoders (AEs) have become a hot research topic due to their unsupervised nature and powerful feature extraction capability. However, autoencoders do not preserve the spatial structure of the original data well during training and are affected by anomalies, resulting in poor detection performance. To address these problems, a hyperspectral anomaly detection method based on an autoencoder with superpixel manifold constraints is proposed. First, a superpixel segmentation technique is used to obtain the superpixels of the hyperspectral image, and a manifold learning method then learns the embedded manifold based on those superpixels. Second, the learned manifold constraints are embedded into the autoencoder to learn a latent representation that maintains the consistency of the local spatial and geometric structure of the hyperspectral image (HSI). Finally, anomalies are detected by computing the reconstruction errors of the autoencoder. Extensive experiments on three datasets show that the proposed method achieves better detection performance than other hyperspectral anomaly detectors.
Citations: 0
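
A compact sketch of the loss structure this pipeline implies, assuming the superpixel graph Laplacian has already been built: reconstruction error plus a graph-Laplacian manifold regularizer tr(Z^T L Z) on the latent codes, with anomaly scores taken as per-pixel reconstruction error. The Laplacian construction and the weight lam are assumptions.

```python
import torch

def manifold_regularized_loss(x, x_hat, z, laplacian, lam: float = 0.1):
    """Reconstruction error plus tr(Z^T L Z), the usual graph-manifold penalty.

    x, x_hat: (N, bands) pixels and their reconstructions.
    z: (N, latent) encoder outputs; laplacian: (N, N) graph Laplacian built
    from superpixel neighborhoods (construction assumed, not shown).
    """
    recon = ((x - x_hat) ** 2).sum(dim=1).mean()
    smooth = torch.trace(z.T @ laplacian @ z) / z.shape[0]
    return recon + lam * smooth

def anomaly_scores(x, x_hat):
    # Pixels the constrained autoencoder cannot reconstruct score as anomalies.
    return ((x - x_hat) ** 2).sum(dim=1)
```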
Multimodal Dialogue Generation Based on Transformer and Collaborative Attention
Wei Guan, Zhen Zhang, Li Ma
DOI: 10.1145/3573942.3574091 | Published: 2022-09-23
Abstract: Current multimodal dialogue generation models are based on a single image for question-and-answer dialogue generation, so image information cannot be deeply integrated into the sentences; as a result they cannot generate semantically coherent, informative, visually contextual dialogue responses, which limits their application in real scenarios. This paper proposes a Deep Collaborative Attention Model (DCAN) for multimodal dialogue generation tasks. First, the method globally encodes the dialogue context and its corresponding visual context. Second, to guide the joint learning of interactions between image and text representations, the visual context features are fused with the dialogue context features through a collaborative attention mechanism, and the Hadamard product is then used to fully fuse the multimodal features again to improve network performance. Finally, the fused features are fed into a Transformer-based decoder to generate coherent, informative responses. To address continuous dialogue in the multimodal setting, experiments are conducted on the OpenVidial2.0 dataset. The results show that the responses generated by this model have higher relevance and diversity than existing comparison models and that it effectively integrates visual context information.
Citations: 0
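
A minimal sketch of the fusion step as described: two cross-attention passes (text attends to image, image attends to text) followed by a Hadamard-product fusion. The dimensions, single-layer structure, and the pooling used to align sequence lengths are assumptions.

```python
import torch
import torch.nn as nn

class CollaborativeFusion(nn.Module):
    """Cross-attend in both directions, then fuse with a Hadamard product."""
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.txt2img = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.img2txt = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, text_feats, img_feats):
        # Text queries attend over image keys/values, and vice versa.
        t, _ = self.txt2img(text_feats, img_feats, img_feats)
        i, _ = self.img2txt(img_feats, text_feats, text_feats)
        # Hadamard (elementwise) product re-fuses the two attended streams;
        # mean-pooling the image stream to the text length is an assumption.
        i_pooled = i.mean(dim=1, keepdim=True).expand_as(t)
        return t * i_pooled  # fed to the Transformer decoder downstream

fusion = CollaborativeFusion()
out = fusion(torch.randn(2, 20, 512), torch.randn(2, 49, 512))
```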
Voicifier-LN: An Novel Approach to Elevate the Speaker Similarity for General Zero-shot Multi-Speaker TTS
Dengfeng Ke, Liangjie Huang, Wenhan Yao, Ruixin Hu, Xueyin Zu, Yanlu Xie, Jinsong Zhang
DOI: 10.1145/3573942.3574120 | Published: 2022-09-23
Abstract: Speech generated by neural network-based Text-to-Speech (TTS) has become more natural and intelligible. However, performance still drops noticeably when synthesizing multi-speaker speech in a zero-shot manner, especially for speakers from different countries with different accents. To bridge this gap, we propose a novel method called Voicifier. It first operates on high-frequency mel-spectrogram bins to approximately remove content and rhythm. Voicifier then uses two strategies, from shallow to deep mixing, to further destroy content and rhythm while retaining timbre. Furthermore, for better zero-shot performance, we propose Voice-Pin Layer Normalization (VPLN), which pins down the timbre in accordance with the text feature. During inference, the model can synthesize high-quality, high-similarity speech from only about one second of target speech audio. Experiments and ablation studies show that these methods retain more of the target timbre while discarding much more of the content- and rhythm-related information. To the best of our knowledge, the methods are universal; that is, they can be applied to most existing TTS systems to enhance cross-speaker synthesis.
Citations: 0
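
A hedged sketch of a conditional layer normalization in the spirit of VPLN: the scale and shift are predicted from a timbre embedding so that normalization "pins" speaker identity onto the text-side features. This is a generic conditional LayerNorm; the paper's exact VPLN formulation is not given in the abstract.

```python
import torch
import torch.nn as nn

class ConditionalLayerNorm(nn.Module):
    """LayerNorm whose affine parameters come from a conditioning embedding.

    A generic stand-in for VPLN: `cond` would be the timbre embedding that
    the normalization pins onto the text features (details assumed).
    """
    def __init__(self, d_model: int, d_cond: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model, elementwise_affine=False)
        self.to_gamma = nn.Linear(d_cond, d_model)
        self.to_beta = nn.Linear(d_cond, d_model)

    def forward(self, x, cond):
        # x: (batch, time, d_model); cond: (batch, d_cond)
        gamma = self.to_gamma(cond).unsqueeze(1)
        beta = self.to_beta(cond).unsqueeze(1)
        return gamma * self.norm(x) + beta

layer = ConditionalLayerNorm(d_model=256, d_cond=64)
y = layer(torch.randn(2, 100, 256), torch.randn(2, 64))
```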
Incremental Encoding Transformer Incorporating Common-sense Awareness for Conversational Sentiment Recognition
Xiao Yang, Xiaopeng Cao, Hao Liang
DOI: 10.1145/3573942.3573965 | Published: 2022-09-23
Abstract: Conversational sentiment recognition is widely used in people's lives and work, but machines do not understand emotions through common-sense cognition. We propose an Incremental Encoding Transformer Incorporating Common-sense Awareness (IETCA) model, which helps machines use common-sense knowledge to better understand emotions in conversation. The model uses a context-aware graph attention mechanism to obtain knowledge-enriched utterance representations and an incremental encoding Transformer to obtain rich contextual representations. Experiments on five datasets show that the model yields improvements in conversational sentiment recognition.
Citations: 0
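
A small sketch of the knowledge-enrichment step: a context-aware attention over retrieved common-sense concept embeddings, whose weighted sum augments the utterance representation. The bilinear scoring form and dimensions are assumptions; the graph construction is not shown.

```python
import torch
import torch.nn.functional as F

def knowledge_enriched_utterance(utt, concepts, w):
    """Attend over common-sense concept embeddings conditioned on context.

    utt: (d,) utterance vector; concepts: (k, d) retrieved concept
    embeddings; w: (d, d) learned bilinear scoring matrix (assumed form).
    """
    scores = concepts @ (w @ utt)      # context-aware relevance of each concept
    alpha = F.softmax(scores, dim=0)   # attention weights over the k concepts
    knowledge = alpha @ concepts       # weighted sum of concept embeddings
    return utt + knowledge             # knowledge-enriched representation

d, k = 128, 10
out = knowledge_enriched_utterance(torch.randn(d), torch.randn(k, d),
                                   torch.randn(d, d))
```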
A Dual-Task Deep Neural Network for Scene and Action Recognition Based on 3D SENet and 3D SEResNet
Zhouzhou Wei, Yuelei Xiao
DOI: 10.1145/3573942.3574077 | Published: 2022-09-23
Abstract: To address the problem that scene information becomes noise and interferes with the feature extraction stage of action recognition, a dual-task deep neural network model for scene and action recognition is proposed. The model first uses a convolutional layer and a max-pooling layer as shared layers to extract low-dimensional features, then uses 3D SEResNet for action recognition and 3D SENet for scene recognition, and finally outputs their respective results. In addition, because existing public datasets do not associate actions with scenes, we built our own scene and action dataset (SAAD). Experimental results show that our method performs better than other methods on the SAAD dataset.
Citations: 0
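
A sketch of the 3D squeeze-and-excitation block that both branches rely on: global average pooling over time and space, a two-layer bottleneck, and sigmoid gating that reweights the channels. The reduction ratio is the usual default, assumed here.

```python
import torch
import torch.nn as nn

class SEBlock3D(nn.Module):
    """Squeeze-and-excitation for 3D (video) feature maps."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)  # squeeze over (T, H, W)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                    # per-channel gates in (0, 1)
        )

    def forward(self, x):
        b, c = x.shape[:2]
        gates = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1, 1)
        return x * gates                     # excite: reweight the channels

se = SEBlock3D(64)
y = se(torch.randn(2, 64, 8, 28, 28))        # (batch, channels, frames, H, W)
```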
Neural Network Prediction Model Based on Differential Localization
Yuanhua Liu, Ruini Li, Xinliang Niu
DOI: 10.1145/3573942.3573960 | Published: 2022-09-23
Abstract: Global Navigation Satellite System-Reflectometry (GNSS-R) signals are affected by buildings, trees, and other obstacles during transmission, which introduces large errors. The traditional approach uses differencing to eliminate most of these errors and improve positioning accuracy. This paper proposes a neural network prediction model based on differential results: the differential results X, Y, and Z are used as the inputs of a neural network to predict the satellite position, which is then compared with the true value. Artificial Neural Network (ANN), Recurrent Neural Network (RNN), and Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) models are used for training and prediction. The results show that, compared with the ANN model, the Mean Absolute Percentage Error (MAPE) and Root Mean Squared Error (RMSE) of the RNN model are reduced by 1.54% and 3.59%, respectively; compared with the RNN model, the MAPE and RMSE of the LSTM-RNN model are reduced by 21.16% and 14.81%, respectively, demonstrating that the LSTM-RNN trains more accurately and fits better.
Citations: 0
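
A minimal sketch of the comparison's two error metrics plus an LSTM regressor of the assumed shape (a window of differential X, Y, Z in, a position estimate out); the window length and layer sizes are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Absolute Percentage Error, in percent (assumes nonzero y_true)."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root Mean Squared Error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

class LSTMPredictor(nn.Module):
    """Map a window of differential (X, Y, Z) results to a position estimate."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)   # predicted (X, Y, Z)

    def forward(self, x):                  # x: (batch, window, 3)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # predict from the last time step

model = LSTMPredictor()
pred = model(torch.randn(8, 10, 3))        # 8 windows of 10 epochs each
```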