{"title":"A Novel Speech-Driven Lip-Sync Model with CNN and LSTM","authors":"Xiaohong Li, Xiang Wang, Kai Wang, Shiguo Lian","doi":"10.1109/CISP-BMEI53629.2021.9624360","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624360","url":null,"abstract":"Generating synchronized and natural lip movement with speech is one of the most important tasks in creating realistic virtual characters. In this paper, we present a combined deep neural network of one-dimensional convolutions and LSTM to generate vertex displacement of a 3D template face model from variable-length speech input. The motion of the lower part of the face, which is represented by the vertex movement of 3D lip shapes, is consistent with the input speech. In order to enhance the robustness of the network to different sound signals, we adapt a trained speech recognition model to extract speech features, and a velocity loss term is adopted to reduce the jitter of generated facial animation. We recorded a series of videos of a Chinese adult speaking Mandarin and created a new speech-animation dataset to compensate for the lack of such public data. Qualitative and quantitative evaluations indicate that our model is able to generate smooth and natural lip movements synchronized with speech.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115808021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A simple illumination map estimation based on Retinex model for low-light image enhancement","authors":"Shiqiang Tang, Changli Li, X. Pan","doi":"10.1109/CISP-BMEI53629.2021.9624323","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624323","url":null,"abstract":"This paper proposes an effective illumination map estimation based on Retinex theory for low illuminance image enhancement. Firstly, an initial illumination map is calculated by finding the largest element value in the b, g and r channels. Secondly, we adopt anisotropic filter operations to process the initial illumination map. Then, we propose an adaptive gamma correction to process it to make the illuminance map more accurate. Finally, we adopt unsharp masking to enhance details to get our result. Objective and subjective evaluations illustrate the superiority of our proposed algorithm.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124323754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CCANet: Exploiting Pixel-wise Semantics for Irregular Scene Text Spotting","authors":"Shanbo Xu, Chen Chen, Silong Peng, Xiyuan Hu","doi":"10.1109/CISP-BMEI53629.2021.9624403","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624403","url":null,"abstract":"Despite the progress in regular scene text spotting, how to detect and recognize irregular text with efficiency and accuracy remains a challenging task. In this work, we propose a novel Corner and Character Assisted Network (CCANet) which exploits pixel-wise semantics to learn explicit text corner and character center positions with low computational cost. Concretely, in the detection stage, we develop a pixel-level Corner Rectification Branch to refine the inaccurately regressed text corners; in the recognition stage, we design another pixel-level Character Enhancement Branch which generates a Gaussian-like character center heatmap to provide attention guidance for the decoding process. To overcome the reliance on character-level annotations, we adopt an iterative approach to generate pseudo-GT labels for the character heatmap, which regards the attention peak position of the attention-based recognizer as the true character center. The extensive experiments conducted on two irregular text benchmarks, Total-Text and CTW1500, demonstrate that the proposed CCANet achieves competitive and even new state-of-the-art performance.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128446080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segmentation of Intracranial Aneurysm Based on U-Net and BiConvGRU","authors":"Tao Hu, Jinhua Yu, Heng Yang, W. Ni","doi":"10.1109/CISP-BMEI53629.2021.9624354","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624354","url":null,"abstract":"Intracranial aneurysms (IA) pose a great risk to the health of patients. Digital subtraction angiography (DSA) is often used to diagnose IA. Early diagnosis and treatment of unruptured IA can effectively reduce the incidence of subarachnoid hemorrhage (SAH). In this paper, we proposed and evaluated a neural network structure for aneurysm segmentation to help doctors contour aneurysms from DSA sequences during aneurysm treatment. The network is based on the U-Net structure that is often used in medical image segmentation. A bidirectional convolutional gated recurrent unit (BiConvGRU) module was added to the network. The module can capture the sequence changes between DSA images. In addition, the optical flow images corresponding to DSA images can be put into the network to extract the motion information of DSA. In this way, the network can obtain the spatial information, temporal information and motion information from DSA images. The experimental results showed that the dice coefficient score was 76.24% and the sensitivity was 92.82%.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127050973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bimodal Emotion Recognition using Kernel Canonical Correlation Analysis and Multiple Kernel Learning","authors":"Jingjie Yan, Weigen Qiu","doi":"10.1109/CISP-BMEI53629.2021.9624428","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624428","url":null,"abstract":"Bimodal emotion recognition on account of kernel canonical correlation analysis (KCCA) and multiple kernel learning (MKL) is investigated and utilized to discover the befitting and effectual fusion pattern with respect to facial expression channel and body gesture channel in the form of video data. Firstly, to relieve calculated quantity of the posterior fusion and classification procedure, the two groups of quondam facial expression and body gesture video data are switched to be indicated as the form of lower dimensional histogram spatio-temporal emotion vectors respectively by Dollar's spatio-temporal feature. Then, KCCA-MKL in the form of multiple kernels is adopted to portray the nonlinear character of facial expression and body gesture video data, and simultaneously to search two modalities' conjunct nonlinear correlative structures by considering the disadvantage of the single kernel used in KCCA. The rudimentary idea of the KCCA-MKL method is using multiple kernels with the combination of Gaussian kernel and $\\chi^{2}$ kernel to substitute for the single kernel in KCCA. In the experiment step, some types of the combination of the Gaussian kernel and the $\\chi^{2}$ kernel are implemented in KCCA-MKL. The test results display that the classification accuracy of the KCCA-MKL approach is 56.91% using the KNN classifier, and is better than two unimodal methods and the single kernel method. Consequently, KCCA-MKL is more unfailing and efficient for bimodal emotion recognition.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127096928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Image Super-Resolution Reconstruction Method Based On LapSRN","authors":"Lei Kong, L. Jiao, Feng Jia, Kai Sun","doi":"10.1109/CISP-BMEI53629.2021.9624332","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624332","url":null,"abstract":"With the gradual maturity of the traditional static image recognition field, super-resolution reconstruction based on deep neural networks is a research hotspot and difficulty in the field of computer vision. In particular, most single-frame image super-resolution methods have problems such as loss of high-frequency information, noise introduced in the up-sampling process, and difficulty in determining the interdependence between each channel of the feature map when reconstructing the predicted image. In order to solve the above problems, we introduce a back projection mechanism into the LapSRN network in this paper. Introducing the back projection mechanism effectively improved the consistency between the extracted image feature data and the target feature data, and thereby improved the reconstructed image parameters. Experiments show that the improved network can achieve better performance than LapSRN.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130564217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Failure Detection Of Infrared Thermal Imaging Power Equipment Based On Improved DenseNet","authors":"Bingxiao Mei, R. Han, Xiongwei Jiang, Yue Wang, Decai Yin","doi":"10.1109/CISP-BMEI53629.2021.9624227","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624227","url":null,"abstract":"Timely maintenance of the power equipment is the key to ensure the normal operation of the transmission equipment. In the substation scenario, a thermal infrared image detection method is proposed for target detection to detect potential faults in advance. The proposed method replaces the backbone network of Faster RCNN with DenseNet, to extract richer features, and in order to reduce the number of parameters of the backbone network, replaces standard convolution with learnable group convolution. To alleviate the problem of feature loss in group convolution, the SFR structure is added to activate the features and improve the feature utilization. In order to reduce the complexity of the network and reasonably reduce the number of convolutions, we obtain a better lightweight model, and improve the NMS algorithm for the problem of regional overlap detection omission. Experiments show that the algorithm used has higher accuracy than YOLO and SSD, and the improved model not only reduces the network complexity, but also improves certain performance, and the final detection accuracy is 95.8%, which can be well applied to thermal infrared image detection.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130851834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Effectiveness of Point-of-Care Testing with Intervention in Psychopathology: A Pilot Study","authors":"Mohammed Khan, L. Hadjileontiadis, D. Cornforth, Jon Drummond, H. F. Jelinek","doi":"10.1109/CISP-BMEI53629.2021.9624223","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624223","url":null,"abstract":"Point-of-Care Testing (POCT) has mainly addressed biochemical systems. This paper presents a POCT based on recording physiological data and the options for intervention. Smartphones or tablets provide an ever-increasing number of applications for measuring diverse biochemical as well as physiological variables. An important adjunct to this is that the POCT App should provide a means of intervention. Previous work on assessing the efficacy of the biofeedback using HeartMath device has mainly concentrated on the effect of heart rate response, measured as heart rate variability (HRV) with the aim of improving anxiety, depression and immune response. This study investigated the effect of diaphragmatic paced breathing (6 breaths/min) or a serious game-based balloon-game for guiding biofeedback compared to normal breathing on electroencephalograph (EEG) signal complexity. Signal characteristics were analyzed following pre-processing and using the Higuchi Fractal Dimension (HFD) from the EEG directly, HFD obtained after applying Hilbert-Huang Transform (HHT) and Sample Entropy (SE) from EEG directly. Six subjects participated in the repeated measures pilot study. EEG was recorded using the Thought Technology device with the scalp electrode located at Cz prior to, during and after HeartMath biofeedback training. Using all three complexity measures, most or all participants showed the lowest signal complexity during paced breathing regardless of analysis method. Only HFD with HHT showed a significant statistical difference (p<0.05) between the three conditions when using a Friedman repeated measures test. The findings suggest that biofeedback may be efficacious for POCT in psychopathology to reduce complexity of EEG which is often higher in patients with anxiety, depression or schizophrenia.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130973733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of Multi-tank Wireless Control Automatic Cupping Device","authors":"Lin Yingxiang, Li Xiaohui, Liu Yun, C. Huilin, Yan Jianjun","doi":"10.1109/CISP-BMEI53629.2021.9624413","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624413","url":null,"abstract":"Traditional cupping operation pressure, temperature, time and many other factors have great uncertainty, and the existing automatic cupping device has a single function and only supports a single tank. So in this paper we designed an intelligent multi-tank automatic cupping device, which can achieve automatic suction, pressure control, timing, wireless control and other functions. This design is mainly based on the automatic vacuum cupping device, realizes the automatic control of cupping pressure of 2–4 tanks at the same time through Arduino UNO microcontroller, and realizes the interactive communication between cupping device and mobile APP through HC-05 Bluetooth. The cupping parameters can be set on the APP to control cupping. This design combines the characteristics of traditional Chinese medicine cupping, modern automation technology and wireless communication technology to make cupping operation more convenient and more in line with the market demand of home cupping.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131684828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Portable in vivo measurement of apple sugar content based on mobile phone","authors":"Fanli Lin, Yishu Li","doi":"10.1109/CISP-BMEI53629.2021.9624327","DOIUrl":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624327","url":null,"abstract":"With the development of science and technology, people's living standards continue to improve. Sugar content has been widely studied as an important index to measure fruit quality, but at present, most sugar content detection needs expensive equipment or accessories, which makes it difficult to bring into daily life. In this paper, we propose a convenient scheme for detecting apple sugar content based on multispectral imaging and machine learning. With the mobile phone screen as the main light source, the front camera captures apple pictures, and the color of the mobile phone screen is changed to change the wavelength of the light source. By photographing different surfaces of apples under different wavelengths of visible light, we obtain pictures of the same apple with different spectra, sort out the data, make the data set and train the machine learning network model, deploy the trained network into the app, and then predict the sugar value of apples, so as to achieve the purpose of rapid and nondestructive detection of apple sugar.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125497428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}