{"title":"Reconstructing 3D Shapes as an Union of Boxes from Multi-View Images","authors":"Zihan Yang, Minglun Gong","doi":"10.1145/3609703.3609705","DOIUrl":"https://doi.org/10.1145/3609703.3609705","url":null,"abstract":"The task of reconstructing object shapes from input images has become increasingly important in various fields, such as computer vision, robotics, augmented reality, video games, and autonomous vehicles. While approaches for reconstructing shapes with varying levels of detail have been proposed, balancing representation accuracy and model complexity remains a challenge. To address this challenge, we propose an end-to-end approach for reconstructing object shapes from multiple images using a union of box primitives. Our approach offers a simpler and more efficient 3D representation of objects without the need for intermediate products such as voxels, resulting in faster inference times. Additionally, we introduce an auxiliary task to aid in learning how to extract and transform spatial features from images without requiring camera calibrations. Extensive experiments demonstrate that our method can produce comparable results to approaches that require 3D voxelized input while utilizing only 2D RGB images as input. Furthermore, our method significantly outperforms the aforementioned approaches in terms of inference time.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115668506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification-Dissemination-Warning: Algorithm and Prediction of Early Warning of Network Public Opinion","authors":"Lin Sun","doi":"10.1145/3609703.3609723","DOIUrl":"https://doi.org/10.1145/3609703.3609723","url":null,"abstract":"To better monitor public opinion, this study reviews existing work, theoretically and practically, from the interdisciplinary perspective of communication and computer science, and then proposes a new vision under the framework of “identification-dissemination-warning”. On the practical front, it includes news reports and social media in data collection, emphasizes attitude analysis, builds an evaluation system consisting of four parameters (event, dissemination, status, and response), determines how serious an emergency is based on how fast public opinion will deteriorate, and provides response guidance accordingly. On the theoretical front, this study takes into account the inter-influence between different parameters and optimizes semantic analysis, emergency grading, and nonlinear processing with the help of Bayesian networks, hierarchical network models, grey relational analysis, latent semantic analysis, and BP neural networks.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114616440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-population Runge Kutta Optimizer Based on Gaussian Disturbance","authors":"Jinhan Chen, Yitong Song, Jixiang Zhu, Sheng-Kai Wang","doi":"10.1145/3609703.3609713","DOIUrl":"https://doi.org/10.1145/3609703.3609713","url":null,"abstract":"To address the lack of development capacity of the Runge Kutta Optimizer, we propose the Multi-population Runge Kutta algorithm Based on Gaussian disturbance (MPRUN). In the algorithm, the population is divided into subgroups. Individuals in the subgroups are randomly selected for a global search whose radius decreases with the number of iterations, which improves the global search ability of the subgroups. In addition, the algorithm introduces a Gaussian disturbance mechanism to generate more uniformly distributed populations, applying random perturbation to the global best individual. Finally, the performance of the optimized algorithm is verified on 30 test functions.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122809226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MSYOLOF: Multi-input-single-output encoder network with tripartite feature enhancement for object detection","authors":"Gong Cheng, Xi Yong, Xin Lyu, Tao Zeng, Xinyu Wang, Jiale Chen, Xin Li","doi":"10.1145/3609703.3609710","DOIUrl":"https://doi.org/10.1145/3609703.3609710","url":null,"abstract":"Object detection with a one-level feature is a challenging task, which requires that object representations at different scales can be extracted from a single feature map. However, existing object detectors using a one-level feature suffer from inadequate representations of objects at different scales, resulting in low accuracy for multi-scale object detection, especially for smaller objects. To address this problem, a novel object detector named MSYOLOF is proposed to construct an effective single feature map for detecting objects of different scales. In the proposed network, three modules bring considerable improvements, namely Feature Pyramid Pooling (FPP), Feature Perception Enhancement (FPE), and Dual Branch Receptive Field (DBRF). First, the FPP module aggregates contextual information from various regions to improve the network's ability to capture global information, which strengthens the model's understanding of the overall scene. Second, the FPE module utilizes coordinate attention to construct a residual block that obtains orientation-aware and position-sensitive information, making the network efficient in accurately locating and identifying objects of interest. Third, by rethinking the Dilated Encoder of YOLOF, the DBRF module reduces information loss and mitigates the problem of being sensitive only to large objects when dilated convolution utilizes large expansion rates. Extensive experiments are conducted on the COCO benchmark to validate the effectiveness of our network, which exhibits superior performance compared to other state-of-the-art networks.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117075465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fiber Recognition Algorithm Based on Improved Mask RCNN","authors":"Zheng-hao Huo, Ziyin Li, Ruide Qu, Xiaodong Wang, Fei Ye, Jun Jin, Xiaojuan Yao","doi":"10.1145/3609703.3609719","DOIUrl":"https://doi.org/10.1145/3609703.3609719","url":null,"abstract":"In response to the application requirements of identifying and classifying multiple types of fibers, this paper proposes a fiber recognition algorithm based on an improved Mask RCNN to recognize and classify multiple types of fibers, reduce the labor cost of fiber inspection, and improve inspection efficiency and quality. First, a data augmentation strategy is adopted, which combines three data augmentation methods (RandomFlip, RandomCrop, and Cutout) to achieve the best increase in network performance. Next, a multi-scale training strategy is introduced to improve the model's training efficiency while also enhancing its robustness to scale. Finally, the Convolutional Block Attention Module (CBAM) is added to solve the problem of low recognition and classification accuracy caused by small fine-grained differences between certain fiber classes. The experimental results show that, by introducing data augmentation, multi-scale training, and CBAM, the algorithm achieves a recognition and classification accuracy of 97.87% on the test set, significantly improving the recognition and classification accuracy for various fiber targets.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134142358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on vehicle spare parts demand forecast based on XGBoost-LightGBM","authors":"Qianqian Zhu, Liu Yang, Yingnan Liu","doi":"10.1145/3609703.3609721","DOIUrl":"https://doi.org/10.1145/3609703.3609721","url":null,"abstract":"Vehicle spare parts demand forecasting is crucial for optimizing inventory and improving maintenance efficiency. This study aims to explore a vehicle spare parts demand forecasting method based on the fusion of XGBoost and LightGBM models to enhance prediction accuracy and precision. In this paper, we first collected a large amount of historical spare parts demand data and associated feature data, followed by data preprocessing and feature engineering. Then, we constructed individual machine learning models as well as the XGBoost-LightGBM fusion model, and performed parameter tuning and optimization using the Optuna framework. Experimental results demonstrate that both XGBoost and LightGBM models achieve favorable performance in spare parts demand forecasting, but the fusion of these two models further enhances prediction accuracy. The fusion model exhibits lower MAPE values compared to individual models on the test set, confirming its superiority and effectiveness. This method leverages the strengths of both models and improves prediction accuracy through weight fusion, offering practical significance in achieving accurate spare parts demand forecasting, optimizing inventory, and improving maintenance efficiency. Future research can explore alternative machine learning algorithms and feature engineering methods to further enhance the accuracy and reliability of vehicle spare parts forecasting.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123324728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building Segmentation from Remote Sensing Image via DWT Attention Networks","authors":"Yin-hua Wu, Mingquan Zhou, Shenglin Geng, Dan Zhang","doi":"10.1145/3609703.3609704","DOIUrl":"https://doi.org/10.1145/3609703.3609704","url":null,"abstract":"The attention mechanism has been widely used and has achieved good results in many visual tasks. However, computing attention in vision tasks consumes large amounts of memory and time, which is an obvious disadvantage of this method. To alleviate this problem, we use the DWT (Discrete Wavelet Transform) to reduce the complexity of the attention calculation. DWT transforms an N-dimensional vector into two vectors: a low-frequency component and a high-frequency component, each of dimension N/2. We use only the low-frequency component to calculate the attention matrices, which reduces the complexity of the matrix multiplication, so the time and space consumption of the network is reduced significantly. We also find that building segmentation in remote sensing images differs from general scene segmentation, in which the differences in size and number among target classes are obvious. Even so, our method is still applicable to targets that are large in size and number in general scene images, but not to targets that are small in size and number; this view is also verified by the subsequent experiments on different datasets. We apply our method to three typical networks (Danet, Swin and Segmenter), and carry out comprehensive experiments on the Cityscape dataset and three building segmentation datasets (Inria Aerial Dataset, Massachusetts Buildings Dataset and Chinese Style Architecture Dataset). The experiments show that our method is more suitable for building segmentation and can reduce the computational complexity of the model in building segmentation, while the Mean IoU of the segmentation results is not noticeably reduced and in some cases even improves.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129285724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CapsNet-based drift detection in data stream mining","authors":"Borong Lin, Nanlin Jin","doi":"10.1145/3609703.3609724","DOIUrl":"https://doi.org/10.1145/3609703.3609724","url":null,"abstract":"For data streams, drift detection methods warn of and detect changes in patterns over time. For example, in smart manufacturing, many data streams are generated from sensors that monitor the real-time operation of manufacturing. Drift detection can be used to discover if and how the operation status changes. At present, there are three main approaches to drift detection: error rate-based, distribution-based, and hypothesis-based. However, these approaches share a practical limitation: delays due to their demand for computational time. In a large-scale and high-speed data stream, a time-efficient detector is vital. To address this, this paper proposes a CapsNet-based drift detection algorithm (CapsNet-DDM). Our experimental results and comparative studies show that CapsNet-DDM demonstrates a distinct advantage in computational time, with no compromise on accuracy, F1 score, or effective drift detection rate.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120992664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A study on the line loss index of a substation area based on cooperative games with multiple influencing factors","authors":"Linfeng Wu, Xiaowei Yang, Hao Yang, Zhenhui Zhu, Shunli Chen","doi":"10.1145/3609703.3609715","DOIUrl":"https://doi.org/10.1145/3609703.3609715","url":null,"abstract":"The line loss rate varies significantly among different substation areas due to diverse influencing factors. Consequently, a study is conducted to investigate the line loss index of a substation area by employing a cooperative game approach that considers multiple influencing factors. First, using the available fundamental data of the substation area, a substation area factor suitable for the calculation of \"one substation area, one index\" is constructed. Subsequently, an initial low-voltage substation area line loss prediction model is constructed using Bi-LSTM. Finally, the weights of each influencing factor are calculated using a cooperative game strategy, and the attention mechanism is applied to Bi-LSTM. After the model is trained and optimized, the predicted value of the line loss index for each substation area is output. Experiments indicate that the algorithm can effectively enhance the accuracy of predicting the line loss index value in the substation area, and assist in customized and refined management of loss reduction in the low-voltage distribution substation area.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127563414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vehicle Appearance Damage Detection Based on Mask R-CNN","authors":"Fei Meng, Qianqian Zhu, Xuening Wu","doi":"10.1145/3609703.3609709","DOIUrl":"https://doi.org/10.1145/3609703.3609709","url":null,"abstract":"Fei Meng Automotive Data of China Co.,Ltd., China Automotive Technology and Research Center Co.,Ltd., Tianjin, China mengfei@catarc.ac.cn Qianqian Zhu* Automotive Data of China Co.,Ltd., China Automotive Technology and Research Center Co.,Ltd., Tianjin, China zhuqianqian@catarc.ac.cn* Xuening Wu Automotive Data of China Co.,Ltd., China Automotive Technology and Research Center Co.,Ltd., Tianjin, China wuxuening@catarc.ac.cn","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"365 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120896007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}