{"title":"Minimum Classification Error Training with Speech Synthesis-Based Regularization for Speech Recognition","authors":"Naoto Umezaki, Takumi Okubo, Hideyuki Watanabe, S. Katagiri, M. Ohsaki","doi":"10.1145/3372806.3372819","DOIUrl":"https://doi.org/10.1145/3372806.3372819","url":null,"abstract":"To increase the utility of regularization, a common framework for avoiding the underestimation of the ideal Bayes error, in speech recognizer training, we propose a new classifier training concept that incorporates a regularization term representing the speech synthesis ability of the classifier parameters. To implement this concept, we first introduce a speech recognizer that embeds Line Spectral Pairs-Conjugate Structure-Algebraic Code Excited Linear Prediction (LSP-CS-ACELP) in a Multi-Prototype State-Transition-Model (MP-STM) classifier, define a regularization term that represents the speech synthesis ability by the distance between a training sample and its nearest MP-STM word model, and formalize a new Minimum Classification Error (MCE) training method that jointly minimizes a conventional smooth classification error count loss and the newly defined regularization term. We evaluated the proposed training method on an isolated-word, closed-vocabulary, speaker-independent speech recognition task whose Bayes error is estimated to be about 20% and found that our method successfully produced an estimate of the Bayes error (about 18.4%) with a single training run over the training dataset, without data resampling such as Cross-Validation or assumptions about the sample distribution. Moreover, we investigated the quality of speech synthesized with LSP parameters derived from the trained prototypes and found that the quality of the Bayes error estimation is clearly supported by the speech synthesis ability preserved in training.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125688174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
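The MCE criterion in the abstract above minimizes a smoothed classification-error count plus a regularization term. A minimal, generic sketch of that objective for a single sample (illustrative only; the paper's actual MP-STM/LSP-CS-ACELP formulation and the regularization weight are not reproduced here) might look like:

```python
import math

def mce_loss(scores, label, reg_term, reg_weight=0.1, alpha=1.0):
    """Smoothed classification-error loss (generic MCE form) plus a
    regularization term, in the spirit of regularized MCE training.

    scores    : per-class discriminant values g_j(x)
    label     : index of the correct class
    reg_term  : regularization value (e.g. the distance between the
                sample and its nearest word model, per the paper's concept)
    reg_weight, alpha : illustrative hyperparameters
    """
    g_true = scores[label]
    # Anti-discriminant: the best competing class score.
    g_best_rival = max(s for j, s in enumerate(scores) if j != label)
    d = g_best_rival - g_true                 # misclassification measure
    # Sigmoid smoothing turns the 0/1 error count into a differentiable loss.
    smooth_error = 1.0 / (1.0 + math.exp(-alpha * d))
    return smooth_error + reg_weight * reg_term

# A correctly classified sample (true class scores highest) contributes
# a smoothed error below 0.5 before regularization is added.
loss = mce_loss([2.0, 0.5, -1.0], label=0, reg_term=0.3)
```

The key property is that `d` is negative exactly when the sample is correctly classified, so the sigmoid maps correct decisions below 0.5 and errors above it.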
{"title":"Multi-source Radar Data Fusion via Support Vector Regression","authors":"Zhanchun Gao, Y. Xiang","doi":"10.1145/3372806.3372810","DOIUrl":"https://doi.org/10.1145/3372806.3372810","url":null,"abstract":"Since the measurement error of surveillance sensors such as radar differs from sensor to sensor in the detection of the same target, it is necessary to fuse multi-source radar data to estimate the true location of the target and reduce the measurement error. The key is to establish a nonlinear regression model, because of the uncertainty of the measurement error. In this paper, the Support Vector Regression (SVR) methodology was adopted to estimate the true location of the target based on the measurements of multiple radars. We uniquely identify a region by a sequence of radar ids, meaning that a target in this area can be detected by the radars whose ids are listed in the sequence. A separate regression model was established for each region, independently of the other regions. Since radar data and ADS-B data use different coordinate systems, we mapped all the data into the same two-dimensional Cartesian coordinate system. Within each region, two regression models were established to estimate the target's values on the x-axis and the y-axis. After predicting the x and y coordinates of the target, we convert the coordinates back to the WGS84 format.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114139760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
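The region-keying idea in the abstract above (identify a region by the sequence of radar ids that cover it, then apply that region's regression model) can be sketched as follows. This is a hedged illustration: the data layout is hypothetical, and a plain average stands in for the per-region SVR models the paper actually trains.

```python
def region_key(detections):
    """Identify a region by the sorted sequence of radar ids that detect
    the target there (hypothetical data layout for illustration)."""
    return tuple(sorted(d["radar_id"] for d in detections))

def fuse_xy(detections):
    """Placeholder fusion: average the radars' Cartesian measurements.
    In the paper, the region's trained SVR models replace this average."""
    n = len(detections)
    x = sum(d["x"] for d in detections) / n
    y = sum(d["y"] for d in detections) / n
    return x, y

# Two radars report the same target in 2-D Cartesian coordinates.
dets = [{"radar_id": 3, "x": 10.0, "y": 4.0},
        {"radar_id": 1, "x": 10.4, "y": 3.6}]
key = region_key(dets)   # (1, 3) -> selects that region's models
xy = fuse_xy(dets)       # ~ (10.2, 3.8)
```

In the paper's pipeline, `key` would index a pair of SVR models (one for x, one for y) trained on measurements from exactly that radar set, and the fused Cartesian estimate would finally be converted back to WGS84.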
{"title":"Method for Removing Motion Blur from Images of Harmful Biological Organisms in Power Places Based on Improved Cyclegan","authors":"Dongyang Ye, Shangping Zhong, Jiahao Zhuang, Li Chen","doi":"10.1145/3372806.3372820","DOIUrl":"https://doi.org/10.1145/3372806.3372820","url":null,"abstract":"Nowadays, the automatic detection of harmful organisms in power places has attracted attention because power places are largely unattended. However, surveillance pictures are prone to motion blur, and harmful organisms cannot be effectively detected, due to their frequent and fast movements in power places. On the basis of an improved Cycle-Consistent Adversarial Networks (CycleGAN) model, we propose a method for removing motion blur from images of harmful biological organisms in power places. This method does not require paired blurred and real sharp images for training, which is consistent with actual requirements. In addition, our method improves the classical CycleGAN model by combining cycle consistency and perceptual loss to enhance the detail authenticity of image texture restoration and improve detection accuracy. The model uses Wasserstein GAN with gradient penalty (WGAN-GP) as a loss function to train the deep model. Given the nature of GANs themselves, the generated image distribution can hardly fill the entire real image distribution space. Experimental results show that the proposed method effectively improves the detection accuracy of harmful organisms in power places.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131465499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
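The generator objective described above combines three terms: an adversarial term (WGAN-GP critic score), a cycle-consistency term, and a perceptual term. A toy sketch of the weighted combination (loss weights are illustrative assumptions, and flat Python lists stand in for image tensors and network feature maps):

```python
def l1(a, b):
    """Mean absolute error between two flattened feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def generator_loss(adv_score, real, cycled, feat_real, feat_fake,
                   lam_cyc=10.0, lam_per=1.0):
    """Weighted sum of the three generator terms sketched from the
    abstract: a WGAN critic score (adversarial), cycle-consistency L1,
    and a perceptual L1 in feature space. Weights are illustrative."""
    adv = -adv_score                 # the generator maximizes the critic score
    cyc = l1(real, cycled)           # ||G_B(G_A(x)) - x||_1
    per = l1(feat_real, feat_fake)   # distance in a feature (perceptual) space
    return adv + lam_cyc * cyc + lam_per * per

loss = generator_loss(adv_score=0.5,
                      real=[0.2, 0.8], cycled=[0.25, 0.75],
                      feat_real=[1.0, 0.0], feat_fake=[0.9, 0.1])
```

The gradient-penalty term of WGAN-GP applies to the critic, not the generator, so it does not appear in this sum; in a real implementation both `real`/`cycled` and the feature maps would be tensors from the networks.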
{"title":"The Development and Trend of ECG Diagnosis Assisted by Artificial Intelligence","authors":"Tongnan Xia, Mengyao Shu, Hongtao Fan, Lei Ma, Yaojie Sun","doi":"10.1145/3372806.3372807","DOIUrl":"https://doi.org/10.1145/3372806.3372807","url":null,"abstract":"Due to the low accuracy and efficiency of traditional manual and existing automated ECG interpretation, misdiagnosis and missed diagnosis occur easily. Studies have shown that artificial intelligence technology is the future direction of ECG diagnosis. The wide application of artificial intelligence in ECG diagnostic systems will effectively promote the rapid development of electrocardiography and improve the level of clinical prevention, early warning, treatment, and prognosis evaluation. Based on the research situation of our research group, this paper summarizes and introduces the progress of research, at home and abroad, on using artificial intelligence technology to assist ECG diagnosis.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"253 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133646700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Link Modeling and Simulation Based on DEVS","authors":"Zhenxing Luo, Lulu Zhao, Wanyong Tian, Dan Yang, Yiyuan Chen, Jiabin Yu, Jianjun Li","doi":"10.1145/3372806.3374911","DOIUrl":"https://doi.org/10.1145/3372806.3374911","url":null,"abstract":"The Discrete Event System Specification (DEVS) [1] provides a reference standard for the model design and simulation development of complex discrete-event systems. It defines a formal mechanism for describing discrete event states, composed of a set of strictly abstract mathematical symbols, and provides a rigorous description mechanism and execution logic for the modeling and simulation of discrete-event systems, ensuring the normalization, reusability, and executability of models. As a special link system, a Data Link [2] differs from a general communication system: it is mainly used between different military platforms to ensure information sharing. By linking sensors, command-and-control systems, and weapon platforms according to a specified message format and communication protocol, an information system that automatically transmits formatted data on battlefield situation, command and guidance, tactical coordination, weapon control, etc., in real time can form a close and efficient tactical link between different combat platforms. In this paper, DEVS is used to simulate the Reporting Responsibility for Air, Surface, and Land tracks in MIL-STD-6016B.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115243282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
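An atomic DEVS model is defined by its state, a time-advance function, internal and external transition functions, and an output function. A minimal skeleton of that formalism (the track-reporting behavior is a hypothetical stand-in, not the paper's MIL-STD-6016B model):

```python
class AtomicDEVS:
    """Minimal atomic DEVS model: a track reporter that emits its
    pending track after a fixed reporting interval (hypothetical
    stand-in for a reporting-responsibility model)."""

    def __init__(self, interval=5.0):
        self.state = {"track": None}
        self.interval = interval

    def ta(self):
        """Time advance: how long to remain in the current state."""
        return self.interval if self.state["track"] else float("inf")

    def delta_ext(self, event):
        """External transition: store an incoming track update."""
        self.state["track"] = event

    def delta_int(self):
        """Internal transition, fired when ta() expires: clear the report."""
        self.state["track"] = None

    def output(self):
        """Output function (lambda): report the pending track."""
        return self.state["track"]

m = AtomicDEVS()
m.delta_ext({"id": 42, "kind": "Air"})   # an external track event arrives
report = m.output()                       # the model would emit this track
```

A DEVS simulator drives such models generically: it schedules `delta_int` at `ta()` from now, calls `output` just before it, and routes events between coupled models via `delta_ext`. That separation is what gives DEVS models their reusability.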
{"title":"Maximum Bayes Boundary-Ness Training For Pattern Classification","authors":"Masahiro Senda, David Ha, Hideyuki Watanabe, S. Katagiri, M. Ohsaki","doi":"10.1145/3372806.3372817","DOIUrl":"https://doi.org/10.1145/3372806.3372817","url":null,"abstract":"The ultimate goal of pattern classifier parameter training is to achieve the optimal status (value) that produces the Bayes error or the corresponding Bayes boundary. To realize this goal without unrealistically long training repetitions or strict parameter assumptions, the Bayes Boundary-ness-based Selection (BBS) method was recently proposed and its effectiveness clearly demonstrated. However, the BBS method remains cumbersome because it consists of two stages: the first stage generates many candidate sets of trained parameters by carefully controlling the training hyperparameters so that the candidate sets include the optimal target parameter set; the second stage selects an optimal set from those candidates. To remove this burden, we propose a new one-stage training method that directly optimizes a given classifier parameter set by maximizing its Bayes boundary-ness, i.e., increasing its accuracy of Bayes error estimation. We experimentally evaluate our proposed method in terms of the accuracy of Bayes error estimation over four synthetic or real-life datasets. Our experimental results clearly show that it successfully overcomes the drawbacks of the preceding BBS method and directly produces an optimal classifier parameter status without generating many candidate parameter sets.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122472148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
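The quantity both this abstract and the MCE abstract above target is the Bayes error: the irreducible error of the optimal decision rule. For intuition (a toy numeric check, not the paper's estimation method), for two equal-prior 1-D Gaussian classes the Bayes error is the integral of the pointwise minimum of the weighted class densities:

```python
import math

def gauss(x, mu, sigma=1.0):
    """Standard Gaussian density with mean mu."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_error_two_gaussians(mu=1.0, lo=-10.0, hi=10.0, n=20000):
    """Numerically integrate min(0.5*f1, 0.5*f2) for two unit-variance
    Gaussian classes at -mu and +mu with equal priors: the error of the
    optimal (Bayes) decision rule, via the trapezoid rule."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * 0.5 * min(gauss(x, -mu), gauss(x, mu)) * h
    return total

est = bayes_error_two_gaussians(mu=1.0)
# closed form for this symmetric case: Phi(-mu), about 0.1587 at mu = 1
```

Training methods like the one above aim to estimate this quantity from finite data, where the densities are unknown, so no such closed form is available.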
{"title":"Multi-scale Fusion and Channel Weighted CNN for Acoustic Scene Classification","authors":"Liping Yang, Xinxing Chen, Lianjie Tao, Xiaohua Gu","doi":"10.1145/3372806.3372809","DOIUrl":"https://doi.org/10.1145/3372806.3372809","url":null,"abstract":"Ensemble semantic features are useful for acoustic scene classification. In this paper, we propose a multi-scale fusion and channel-weighted CNN framework. The framework consists of two stages: multi-scale feature fusion and channel weighting. The multi-scale feature fusion stage extracts a hierarchy of semantic feature maps using a CNN with a simplified Xception architecture and then integrates the multi-scale semantic features through a top-down pathway. The channel weighting stage squeezes each feature map into a channel descriptor and then transforms it into a set of channel weighting factors that reinforce the importance of each channel for acoustic scene classification. Experimental results on DCASE2018 acoustic scene classification subtask A and subtask B demonstrate the performance of the proposed framework.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123491117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
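The channel-weighting stage described above (squeeze each channel to a scalar descriptor, then turn descriptors into per-channel weights) can be sketched in a few lines. This is an illustrative reduction: the paper learns the descriptor-to-weight mapping, whereas here it is simply identity followed by a sigmoid.

```python
import math

def channel_weights(feature_maps):
    """Squeeze each channel's 2-D feature map to a scalar by global
    average pooling, then map it through a sigmoid to a weight in (0, 1).
    (A learned transform would sit between the pool and the sigmoid.)"""
    weights = []
    for fmap in feature_maps:                  # fmap: 2-D list (H x W)
        total = sum(sum(row) for row in fmap)
        count = sum(len(row) for row in fmap)
        descriptor = total / count             # squeeze: global average pool
        weights.append(1.0 / (1.0 + math.exp(-descriptor)))
    return weights

def apply_weights(feature_maps, weights):
    """Reweight every value of each channel by its scalar weight."""
    return [[[v * w for v in row] for row in fmap]
            for fmap, w in zip(feature_maps, weights)]

fmaps = [[[1.0, 3.0], [5.0, 7.0]],   # strongly activated channel
         [[0.0, 0.0], [0.0, 0.0]]]   # inactive channel
w = channel_weights(fmaps)           # active channel gets a weight near 1
```

The effect is that channels whose responses matter for the scene are amplified and quiet channels are suppressed before classification.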
{"title":"Multi-Scale Deep Convolutional Nets with Attention Model and Conditional Random Fields for Semantic Image Segmentation","authors":"Ming Liu, Caiming Zhang, Zhao Zhang","doi":"10.1145/3372806.3372811","DOIUrl":"https://doi.org/10.1145/3372806.3372811","url":null,"abstract":"Although Convolutional Neural Networks are effective visual models that generate hierarchies of features, some shortcomings remain in the application of Deep Convolutional Neural Networks to semantic image segmentation. In this work, our algorithm incorporates multi-scale atrous convolution, an attention model, and Conditional Random Fields to tackle this problem. Firstly, our method replaces deconvolutional layers with atrous convolutional layers to avoid reducing feature resolution when the Deep Convolutional Neural Network is employed in a fully convolutional fashion. Secondly, a multi-scale architecture and an attention model are used to extract features at multiple scales. Thirdly, we use Conditional Random Fields to prevent the built-in invariance of Deep Convolutional Neural Networks from reducing localization accuracy. Moreover, our network fully integrates Conditional Random Field modelling with the Deep Convolutional Neural Network, making it possible to train the deep network end-to-end. In this paper, our method is applied to semantic image segmentation, and the effectiveness of our model is demonstrated with experiments on PASCAL VOC 2012.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121035952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
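Atrous (dilated) convolution, the key ingredient above, spaces the kernel taps `dilation` samples apart: the receptive field grows without adding parameters or reducing resolution. A 1-D sketch of the idea (the paper applies the 2-D analogue):

```python
def atrous_conv1d(signal, kernel, dilation=2):
    """1-D atrous (dilated) convolution in 'valid' mode: kernel taps are
    spaced `dilation` samples apart, enlarging the receptive field
    without extra parameters -- the same idea the method above uses in
    2-D to replace deconvolutional layers."""
    span = (len(kernel) - 1) * dilation     # distance covered by the taps
    out = []
    for i in range(len(signal) - span):
        out.append(sum(kernel[k] * signal[i + k * dilation]
                       for k in range(len(kernel))))
    return out

# dilation=1 is ordinary convolution; dilation=2 doubles the reach.
y = atrous_conv1d([1, 2, 3, 4, 5, 6], [1, 0, -1], dilation=2)
# taps hit offsets 0, 2, 4: y[0] = 1*1 + 0*3 - 1*5 = -4
```

With a 3-tap kernel and dilation 2, each output sees a 5-sample window, the coverage of a 5-tap dense kernel, at 3-tap cost.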
{"title":"Automated Detection of Sewer Pipe Defects Based on Cost-Sensitive Convolutional Neural Network","authors":"Yuhan Chen, Shangping Zhong, Kaizhi Chen, Shoulong Chen, Song Zheng","doi":"10.1145/3372806.3372816","DOIUrl":"https://doi.org/10.1145/3372806.3372816","url":null,"abstract":"Regular inspection and repair of drainage pipes is an important part of urban construction. Currently, many classification methods have been used for defect diagnosis using images taken inside pipelines. However, most of these models train the classifier to maximize accuracy without considering the unequal misclassification costs in defect diagnosis. In this study, the authors analyze the characteristics of sewer pipeline defect detection and design an automated detection framework based on a cost-sensitive deep convolutional neural network (CNN). The method makes the CNN cost-sensitive by introducing cost-sensitive learning at the structural and loss levels of the network. To minimize misclassification costs, the authors propose a new auxiliary loss function, Cost-Mean Loss, which lets the model retain the original network parameters that maximize accuracy while improving performance by minimizing the total misclassification cost during learning. Theoretical analysis shows that the new auxiliary loss function can be applied to the classification task to optimize the expected misclassification cost. Inspection images collected from multiple drainage pipes were used to train and test the network. Results show that after the cost-sensitive strategy was added, the defect detection rate decreased from 2.1% to 0.45%. Moreover, the model with Cost-Mean Loss performs better than the original model.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117077670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
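The "expected misclassification cost" this abstract optimizes has a simple form: weight the model's predictive distribution by a cost matrix. A sketch with hypothetical two-class costs (the matrix values and the specific Cost-Mean Loss formulation are not taken from the paper):

```python
def expected_misclassification_cost(probs, true_label, cost_matrix):
    """Expected cost of a prediction: sum over predicted classes k of
    P(predict k) * cost(true -> k). A cost-sensitive loss in the spirit
    of the Cost-Mean Loss described above penalizes expensive mistakes
    (e.g. missing a real defect) more heavily than cheap ones."""
    return sum(p * cost_matrix[true_label][k] for k, p in enumerate(probs))

# Hypothetical costs: missing a defect (row 1 -> predicted 0) is 5x
# worse than a false alarm (row 0 -> predicted 1); correct costs 0.
COST = [[0.0, 1.0],
        [5.0, 0.0]]

# Same class, very different expected costs depending on the prediction:
c_ok   = expected_misclassification_cost([0.2, 0.8], true_label=1, cost_matrix=COST)
c_miss = expected_misclassification_cost([0.8, 0.2], true_label=1, cost_matrix=COST)
```

Averaging this quantity over a training batch and adding it to the usual accuracy-oriented loss is one way to realize the loss-level cost sensitivity the abstract describes.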
{"title":"Learning How to Avoiding Obstacles for End-to-End Driving with Conditional Imitation Learning","authors":"Enwei Zhang, Hongtu Zhou, Yongchao Ding, Junqiao Zhao, Chen Ye","doi":"10.1145/3372806.3372808","DOIUrl":"https://doi.org/10.1145/3372806.3372808","url":null,"abstract":"Obstacle avoidance is one of the most complex tasks for autonomous driving systems, and it has been ignored by many cutting-edge end-to-end learning-based methods. The difficulty stems from the integrated process of detecting and interpreting the environment and obstacles and generating proper behaviors. We make use of CARLA, a simulator for autonomous driving research, and collect about 6 hours of human drivers' reactions to obstacles on the road under given driving commands, i.e., follow, go straight, turn left, and turn right. A behavior-cloning neural network architecture is proposed with a modified loss that enlarges the effect of steering errors, which proves beneficial to accuracy. We found that data augmentation of the images is crucial to training the proposed network, and that a reasonable limit allows the vehicle to avoid unexpected stops. The experiments demonstrate three obstacle avoidance cases: obstacles of the same type as in the training dataset, other automobiles, and two-wheeled vehicles. Finally, the CARLA benchmark is also tested.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132027934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
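The modified loss in the abstract above up-weights steering errors relative to the other control outputs. A toy sketch of that idea for a single sample (the weight value and the (steer, throttle, brake) output layout are illustrative assumptions, not the paper's exact formulation):

```python
def weighted_bc_loss(pred, target, steer_weight=4.0):
    """Behavior-cloning regression loss with the steering error
    up-weighted, reflecting the kind of modified loss described above.
    pred / target: (steer, throttle, brake) tuples."""
    steer_err = (pred[0] - target[0]) ** 2
    other_err = sum((p - t) ** 2 for p, t in zip(pred[1:], target[1:]))
    return steer_weight * steer_err + other_err

# Only the steering channel is wrong here, and it dominates the loss:
loss = weighted_bc_loss((0.3, 0.5, 0.0), (0.1, 0.5, 0.0))
```

In a conditional imitation learning setup, one such regression head exists per driving command (follow, straight, left, right), and the command selects which head's loss is active for each sample.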