{"title":"Fast QBE: Towards Real-Time Spoken Term Detection with Separable Model","authors":"Ziwei Tian, Shiqing Yang, Minqiang Xu","doi":"10.1109/MLISE57402.2022.00035","DOIUrl":"https://doi.org/10.1109/MLISE57402.2022.00035","url":null,"abstract":"State-of-the-art spoken term detection or query-by-example networks depend on recurrent neural networks (RNNs), which extract fixed-dimensional vectors (embedded vectors) from both the spoken query and the search content, and then calculate cosine distances over the vectors. However, these methods depend on the time sequence, so they are computationally expensive and cannot meet the requirements of both query accuracy and search speed. In this work, we introduce a fast spoken term detection system based on a separable model, RepVGG. First, because of the reparameterization trick, it is faster at inference. Secondly, we use non-maximum suppression and norm in the inference step to improve its performance. Thirdly, we use multilanguage training to improve both the accuracy and robustness of the system. Corresponding experiments are designed to verify these ideas. They show that the proposed methods can improve the GPU real-time factor (RTF) from 150 to 2300 and outperform the state-of-the-art method.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132462979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of a YOLO Model Accelerator Based on PYNQ Architecture","authors":"Lin Wang, Tianyong Ao, Le Fu, Jian Liu, Yang Liu, Yingjie Zhou","doi":"10.1109/MLISE57402.2022.00011","DOIUrl":"https://doi.org/10.1109/MLISE57402.2022.00011","url":null,"abstract":"The application requirements of object detection models based on deep learning are very extensive. However, high computing power requirements often seriously restrict the application of these models on resource-constrained devices with high energy efficiency requirements. To address this problem, a YOLO model accelerator architecture is proposed based on PYNQ. Based on the FPGA hardware platform, the hardware accelerator is designed by making full use of pipeline, loop unrolling, data reordering and other methods to accelerate the computationally intensive units in the YOLOv2 model such as the convolution and pooling layers. In order to reduce the delay in the data transmission process, the multi-channel transmission architecture combined with the ping-pong buffer is designed, and block-by-block reading strategy is adopted to read the off-chip data. The proposed YOLO model accelerator has been implemented and verified on Xilinx PYNQ-z2 platform. The experimental results show that the system has high detection accuracy and far lower power consumption than CPU and GPU. It can also be deployed on mobile devices to detect the surrounding environment.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130989721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Segmentation Algorithm Based on Attention Mechanism and Jump Connection","authors":"Zhaotong Cui, Yanjun Wei, Tianping Li, Guanxing Li","doi":"10.1109/MLISE57402.2022.00058","DOIUrl":"https://doi.org/10.1109/MLISE57402.2022.00058","url":null,"abstract":"With the development of deep learning, convolutional neural networks have become the mainstream of computer vision algorithms. In recent years, the biggest problem of applying convolutional neural networks to image segmentation is that they cannot achieve accurate segmentation of the last layer, also cause resolution loss when extracting features, and cannot meet the demand of different pixels requiring different context dependencies. To address these issues, we add an attention mechanism and a jump feature fusion method to deeplabv3+ so that features are extracted without severe feature loss and a broader range of contextual information can be encoded into local features. The feature map is further enriched by adding a module combining bilinear upsampling and deconvolution in the process of feature restoration. Compared to previous algorithms, the results of this algorithm are superior. A performance of 85.73% is achieved on PASCAL VOC2012 using the proposed model.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116094042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Anchor-Free Object Detection Method","authors":"YuHu Han, Tonghe Ding, Tianping Li, Meng Li","doi":"10.1109/MLISE57402.2022.00009","DOIUrl":"https://doi.org/10.1109/MLISE57402.2022.00009","url":null,"abstract":"Object detection plays an important role in various industries. However, a fully convolutional one-stage detector (FCOS) has high computational cost and low detection accuracy. Therefore, in this paper, we fuse the attention module (CBAM) with the feature pyramid (FPN) to extract important feature information, suppress useless information, and improve detection accuracy. We then use an inverted residual convolution block to replace the detection head of the original method, and the improved detection head reduces the computational cost. We use PASCAL VOC to train and evaluate our network. Experimental results show that, compared with the traditional method, the detection accuracy is improved by 1.2%, and the number of parameters is reduced by 4.2M.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123862282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Domain Adaptive Adversarial Training Method Based on Self-Supervised Learning","authors":"Chuqing Sun","doi":"10.1109/MLISE57402.2022.00070","DOIUrl":"https://doi.org/10.1109/MLISE57402.2022.00070","url":null,"abstract":"Image classification technology based on neural network is an important task in computer vision, and the introduction of transfer learning can solve the problems of lack of data sets and long training time. To address this problem, this paper proposes a self-supervised domain-adaptive adversarial network approach. The algorithm uses the VGG network to extract image features, realizes the transfer learning of different image styles through domain adversarial training, and introduces a data augmentation model and self-supervised learning method based on pseudo-label to improve the accuracy of model classification. The experimental results show that the model can effectively improve the accuracy of image transfer learning of different styles in the image classification problem. When the number of pseudo-labels is 10, the classification effect is the best, and the accuracy rate is improved to 12.99%, which greatly saves training time and computing power while solving the problem of missing training data.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131387461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameter optimization algorithm for quantum particle swarm-based i-vector identification systems","authors":"Guangqi Liu, Wushour Silamu","doi":"10.1109/MLISE57402.2022.00065","DOIUrl":"https://doi.org/10.1109/MLISE57402.2022.00065","url":null,"abstract":"For the noise robustness problem in i-vector: Based on the theoretical principle of i-vector speaker recognition system, the extraction principle and scoring calculation method of i-vector and the process of channel compensation algorithm based on PLDA (Probabilistic Linear Discriminant Analysis) with PLDA model are studied. The matching principle is studied. A statistical averaging i-vector extraction algorithm based on speech fragmentation is proposed to extract more robust i-vector features by weakening the statistical parameters of bad speech fragments to improve the recognition performance of the system. After that, the i-vector system is designed to improve the recognition performance of the i-vector. Then, a Quantum Particle Swarm Optimization is designed to optimize the parameters of the i-vector recognition system to avoid the degradation of the system performance caused by artificial empirical values. Experimental analysis shows that the proposed algorithm has improved performance over the traditional i-vector recognition algorithm, especially in the case of noise interference, and has better recognition performance.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127787471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing the Working Environment of Auto-routing and Obstacle-avoiding Robot Based on Simulink","authors":"Yanze Zhang","doi":"10.1109/mlise57402.2022.00056","DOIUrl":"https://doi.org/10.1109/mlise57402.2022.00056","url":null,"abstract":"Automatic sweeping robots gradually come into people's life. These robots can automatically find the way forward and complete the sweeping task at the same time. It is found that the floor sweeping robot will get stuck in a corner or cannot get out when it enters a blind angle. The main purpose of this work is to improve the action mode of the robot and its working environment. Using Simulink in MATLAB as the main body of the simulation, an environment is built to simulate the work of the auto-routing robot. Through randomly generated different mazes, the working status of the robot in different environments is simulated, the probability of Karton between the improved environment and the robot is compared, and the main factors affecting the robot's abnormal operation are analyzed. When the robot is able to move with a smaller turning radius and length of motion, a smoother and more stable gait track can be presented with a shorter path if the maze track has enough width and the wall is thick. By controlling these factors, people can directly and accurately optimize the working environment of the robot. This is one of the key operations to improve the working environment of this kind of self-seeking and obstacle-avoiding robot and to give them a suitable working environment.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115661229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improvement of super resolution reconstruction method for real text images","authors":"J. Zhang, Hong Qu","doi":"10.1109/MLISE57402.2022.00082","DOIUrl":"https://doi.org/10.1109/MLISE57402.2022.00082","url":null,"abstract":"Super resolution refers to the process of restoring a low resolution image to a high resolution image. In recent years, researchers in the field of super-resolution are not satisfied with restoring artificially defined low-resolution images, and try to restore low-resolution images in natural scenes. For this situation, a real scene text super-resolution dataset TextZoom is proposed. It contains one-to-one mapping of low-resolution and high-resolution images captured by the camera, which is more realistic and challenging than manufactured data. A super-resolution network for TextZoom, which is called TSRN is also proposed. By adding channel attention and increasing the proportion of gradient loss function, the overall network pays more attention to the restoration of text and enhances the lines, and finally improves the recognition rate of medium and difficult graded text images in the TextZoom dataset after super-resolution preprocessing.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117130500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple solutions for a Class of p-Kirchhoff type equations","authors":"Beibei Wang","doi":"10.1109/mlise57402.2022.00024","DOIUrl":"https://doi.org/10.1109/mlise57402.2022.00024","url":null,"abstract":"In this paper, an improved Mountain Pass Theorem is used to study the existence of multiple solutions for the nonlinear p-Kirchhoff Dirichlet boundary value problems under some natural conditions on f(x, v), and some known results are generalized.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115831860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Weighted Voting Classifier for Network Intrusion Detection","authors":"R. Zhang","doi":"10.1109/MLISE57402.2022.00076","DOIUrl":"https://doi.org/10.1109/MLISE57402.2022.00076","url":null,"abstract":"Network security is important for countries, companies, and other governments. Network intrusion detection becomes more and more critical for numerous applications. In network intrusion detection, ensembles are often used to improve the performance of single classifiers. However, how to assign weights for the different classifiers is a problem. Instead of using the simple majority voting method, multiple ways to assign global weights are introduced to achieve better performance. In this paper, a new way of dynamically updating weights while predicting is proposed and applied to the classification problem on the UNSW-2015 dataset. The result shows that the dynamic weighted voting classifier performs better than the fixed weighted voting and simple majority rule voting in general.","PeriodicalId":350291,"journal":{"name":"2022 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)","volume":"19 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115598857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}