{"title":"Improving the Detection Performance of Sparse R-CNN with Different Necks","authors":"Zhaodong Zheng, Zefeng Zhang, Miao Fan, Lilian Huang","doi":"10.1145/3577117.3577135","DOIUrl":"https://doi.org/10.1145/3577117.3577135","url":null,"abstract":"Sparse R-CNN uses a purely sparse method to detect objects and achieves good results. However, it does not make full use of the features extracted from the image, so its detection performance can be further improved. To this end, we propose Sparse R-CNNv1 and Sparse R-CNNv2. In these algorithms, we replace the ResNet backbone of the original Sparse R-CNN with VOVNet equipped with an attention mechanism. In addition, we use two different improved neck networks, Augpan and FPNencoder, to further improve detection performance from the perspective of feature fusion and of increasing the receptive field of each layer, respectively. Our algorithms are trained and validated on COCO2017, and the experimental results show that Sparse R-CNNv1 achieves 45.0 AP and Sparse R-CNNv2 achieves 45.3 AP, both higher than the original Sparse R-CNN's 43.0 AP under the standard 3× training schedule.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"409 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122805081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Extraction Method of Shoreline Based on Airborne Remote Sensing Image","authors":"Rong Chen, B. Jia, Jin Xu, Bo Li, Shuai Xu","doi":"10.1145/3577117.3577125","DOIUrl":"https://doi.org/10.1145/3577117.3577125","url":null,"abstract":"Shoreline information extraction is an important basis for marine supervision and development. Because the traditional shoreline extraction method, based on field investigation and manual interpretation, has a high time cost and low extraction accuracy, a new shoreline extraction method based on airborne remote sensing and segmentation technology is proposed. First, the visible and thermal infrared images are matched to the same size and fused using image processing methods. Then, the triangle threshold method is used to segment the image, separating land from sea. Finally, the shorelines are extracted by removing speckles. The experimental results show that the segmentation algorithm based on UAV remote sensing images is simple and easy to implement, and can extract shorelines quickly, accurately and intelligently.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130562340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Highlight Target Extraction Method based on X-band Shipborne Radar Image","authors":"Bo Li, Jin Xu, Xinxiang Pan, Rong Chen, Shuai Xu, Haixia Wang","doi":"10.1145/3577117.3577121","DOIUrl":"https://doi.org/10.1145/3577117.3577121","url":null,"abstract":"Shipborne radar is an indispensable target detection instrument during navigation. Based on the shipborne radar images collected by the Yukun teaching-training ship of Dalian Maritime University, a highlight target extraction method is proposed here. First, the original shipborne radar image is transformed from the polar coordinate system into the Cartesian coordinate system. Second, an improved Sobel operator is used to convolve the shipborne radar image in the Cartesian coordinate system. Third, the Otsu threshold is used to segment the convolved image and extract the co-frequency interference noise. Fourth, a distance-weighted linear filter is used to suppress the co-frequency interference noise. Fifth, a gray correction matrix is used to adjust the overall gray level of the image. Then, the threshold method is used to obtain a rough segmentation of the highlight targets. After that, speckle noise is suppressed using a pixel-count threshold. Finally, the result image is transformed back to the polar coordinate system. Experimental results show that our method can effectively segment the highlight targets in the original shipborne radar image.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130086310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Target Counting Based on Improved YOLOv5 and SORT Algorithms","authors":"Jiaze Zhang, Shengmao Zhang, Shuxian Wang, Yongwen Sun, Yifan Song","doi":"10.1145/3577117.3577146","DOIUrl":"https://doi.org/10.1145/3577117.3577146","url":null,"abstract":"To count certain targets during fishing vessel operations, this paper uses improved YOLOv5s and SORT algorithms based on deep learning. First, YOLOv5s is fused with the CBAM and SE attention modules, respectively, to reduce the interference of complex backgrounds and improve detection accuracy. The three models are compared and the one with the best detection performance is selected. Second, three counting methods are applied: the threshold method, the SORT algorithm combined with a detection line, and DeepSORT. The results show that the accuracies of YOLOv5s, YOLOv5s fused with CBAM, and YOLOv5s fused with SE are 97.2%, 84.8%, and 98.9%, respectively. YOLOv5s fused with the SE module performs best, 1.7 and 14.1 percentage points higher than the other two models. Among the three counting methods, the SORT algorithm combined with the detection line is the best, with an average count accuracy of 85.7%. Its count accuracies for the three categories Fish_basket, Fish_net, and Process_ship are 96.5%, 85.8%, and 75%, respectively, a significant improvement over the other two methods. The research results can provide a reference for the automated counting of targets during fishing vessel operations.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115366127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Genetic Approach to the Formulation of Tetris Engine","authors":"Hongtao Zhang","doi":"10.1145/3577117.3577134","DOIUrl":"https://doi.org/10.1145/3577117.3577134","url":null,"abstract":"The game Tetris is a well-known research topic in artificial intelligence and machine learning. Many studies already exist. However, we believe more can be learned from this topic, and there is still room for improvement. This paper tackles the Tetris game using three different agents: a handcrafted agent, a local search agent, and a reinforcement learning agent. We implement, compare and analyze all three agents to understand their advantages and disadvantages. In brief, the main result is that the local search agent turns out to be the most successful, performing ten times better than the handcrafted agent and five times better than the reinforcement learning agent. The main result implies two take-away messages. First, sometimes the simple model is the optimal model. Second, we should be cautious when using a Convolutional Neural Network (CNN) to encode the game state because of its spatial invariance property.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129412464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Multi-Attention Network for Single Image Super-resolution","authors":"Zhang Tao, Kai Zeng, Jiachun Zheng, Xiangyu Yu","doi":"10.1145/3577117.3577126","DOIUrl":"https://doi.org/10.1145/3577117.3577126","url":null,"abstract":"Recent research on single image super-resolution (SISR) shows that deep convolutional neural networks (DCNNs) with attention mechanisms achieve better results. Each attention mechanism has its own distinct focus. Specifically, the channel attention mechanism enhances the influence of critical channels by focusing on the expression of characteristics at different channel levels, and the pixel attention mechanism improves the quality of reconstructed images by paying attention to the expression of spatial pixel features. We believe that combining these two mechanisms is a way to further improve the quality of super-resolution images. In this paper, an enhanced multi-attention network (EMAN) is proposed, which combines the advantages of the two attention mechanisms. In addition, to improve the utilization of high-frequency information, a novel edge-based loss function is added to boost the learning of edge regions. Extensive experiments show that the proposed multi-attention network achieves better accuracy and visual quality than single-attention methods.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127367582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SAR Image Segmentation with Superpixel Constraint and Fuzzy Clustering","authors":"Zhenzhen Wan, Chaoshu Jiang, Jiawen Kang, Xiaojie Qu, Xiangtao Min, Xiaoyu Zhang","doi":"10.1145/3577117.3577136","DOIUrl":"https://doi.org/10.1145/3577117.3577136","url":null,"abstract":"Image segmentation is a very important task in the application of synthetic aperture radar (SAR) images, especially for feature extraction from SAR images. Because of speckle noise in SAR images, performing fuzzy clustering directly on them easily produces many isolated points. To address this, a SAR image segmentation method based on superpixel constraints and fuzzy clustering, named FCM_SS, is proposed in this paper. The FCM_SS algorithm first uses the improved SNIC algorithm to produce uniform superpixels, and then averages the pixels within each superpixel; these averages are used as the input for subsequent fuzzy clustering. The experimental results suggest that the FCM_SS algorithm has high segmentation accuracy and strong robustness to noise.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123544186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Intelligent Defect Detection Algorithm for PCB based on Deep Learning","authors":"Xiangyuan Zhu, Xiuchun Xiao, Zhiming Lan, Qi Hong, Miao Hou","doi":"10.1145/3577117.3577142","DOIUrl":"https://doi.org/10.1145/3577117.3577142","url":null,"abstract":"As an essential component of modern machines, the printed circuit board (PCB) is widely used in various electronic products. Its quality significantly affects the quality of products. However, the production process of PCBs is often accompanied by defects. In this paper, a defect detection algorithm is proposed. Data augmentation techniques such as flipping, shifting, brightness adjustment, rotation, and Gaussian noise are applied to diversify the dataset. You only look once (YOLO) v5s is then introduced to detect the PCB defects. Through parameter tuning and optimization, a trained detection model is obtained. F1-score and mean average precision (mAP) are used to assess the performance of the model. The experimental results show that the mAP and F1-score are 99.3% and 99.0%, respectively. The model developed based on the YOLO-v5s algorithm achieves superior performance and is well suited to detecting PCB defects.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116513011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Thumbnail Preserving Encryption Scheme","authors":"Jie Zhang, Zhihua Gan, Yang Yang, Wenbin Jiang, Xin He, Xiu-li Chai","doi":"10.1145/3577117.3577145","DOIUrl":"https://doi.org/10.1145/3577117.3577145","url":null,"abstract":"Cloud services can store a large number of images, but cannot protect user privacy. Traditional image encryption schemes improve privacy, but reduce the visibility of the image, which makes the image unviewable in the cloud service. To maintain both cloud storage security and visual usability, Tajik et al. proposed a thumbnail preserving encryption (TPE) scheme in which the encrypted image is presented as a low-resolution version of the plaintext image. However, this scheme only scrambles and replaces pixels, and has low security. To address the low security of this method, this paper proposes a new thumbnail preserving encryption scheme (New-TPE). The plaintext image information is used as part of the key to tie the encryption process to the plaintext image; the optimal random value is selected from multiple random values and added to the pixel value replacement process, which increases the degree of variation of pixel values before and after the replacement and reduces the correlation between adjacent pixels; the simulated annealing idea is introduced to select the optimal scrambling sequence within the range; and a displacement function scrambling method is proposed to reduce the correlation between adjacent pixels. The experimental results show that the proposed scheme has higher security and a better balance between image privacy and security.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130439783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Design and Implementation of the Natural Handwriting Mathematical Formula Recognition System","authors":"Zhiyi Zhang, Tianxing Wang, Xiankun Song, Yanqing Wang","doi":"10.1145/3577117.3577123","DOIUrl":"https://doi.org/10.1145/3577117.3577123","url":null,"abstract":"Traditional handwritten mathematical formula recognition has many shortcomings, such as a low recognition rate and complicated operation. To make handwritten mathematical formula recognition more accurate and easier to use, and to address the low efficiency of formula editing for researchers in mathematics-related professions, this paper implements a natural handwritten mathematical formula recognition system with one-click operation and a higher recognition rate. The system uses a core algorithm based on histogram projection and the dynamic comparison word method to separate the target formulas, and a three-layer convolutional neural network (CNN) model to recognize the segmented strings. Experiments show that the improved algorithm has strong learning ability and robustness.","PeriodicalId":309874,"journal":{"name":"Proceedings of the 6th International Conference on Advances in Image Processing","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129387542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}