{"title":"Comparative Analysis of Features In a Speech Emotion Recognition System using Convolutional Neural Networks","authors":"Prachii Kumar, K. S. Shushrutha","doi":"10.1109/ISPACS51563.2021.9651089","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651089","url":null,"abstract":"In the past decade, Speech Emotion Recognition (SER) in many spoken languages has become a field of growing interest. MFCCs (Mel Frequency Cepstrum Coefficients) are commonly utilized representations for audio classification, and are now becoming a prominent feature in SER systems. However, in the view of a performance analysis, there exists another feature named PCEN (Per Channel Energy Normalization) that has proven to outperform MFCCs in the context of speech. In order to compare the performances of the MFCC and PCEN, they have individually been used as inputs into a one dimensional Convolutional Neural Network (CNN). The samples from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) were utilized. Furthermore, the framework proposed in this paper obtains an accuracy of 85.3% for the configuration that utilizes PCEN, 77.4% for the configuration that uses only the MFCCs as inputs, and 78.1% that combines both.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128632938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Privacy-Preserving Periodic Frequent Pattern Model in AIoT Applications","authors":"Usman Ahmed, Chun-Wei Lin, Philippe Fournier-Viger, Chien-Fu Cheng","doi":"10.1109/ISPACS51563.2021.9651132","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651132","url":null,"abstract":"We begin with a sanitization strategy for concealing sensitive periodic frequent patterns in this study. The developed method employs the Term Frequency and Inverse Document Frequency (TF-IDF) to determine which transactions and objects should be sanitized based on user-defined sensitive periodic frequent patterns. Using the designed approach, it is possible to correctly and properly choose victim items in the transactional database for data sanitization.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116792489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Multiple Object Segmentation in Mask Coefficient Space","authors":"Yunmu Huang, Shih-Shinh Huang, Feng-Chia Chang","doi":"10.1109/ISPACS51563.2021.9651078","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651078","url":null,"abstract":"Recently, the exponential increase in video data makes video instance segmentation attracts significant attention in the field of computer vision. In this work, we propose a method for online multiple object segmentation. The proposed method describes each object by the mask coefficients with respect to the generated prototypes. Instead of tracking multiple objects in image/feature space, we address the segmentation and tracking issues directly in the mask coefficient space that is stable and discriminative for temporal matching. In the experiment, we validate the proposed method by using the DAVIS 2019 dataset.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117142737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Q-SegNet: Quantized deep convolutional neural network for image segmentation on FPGA","authors":"Afaroj Ahamad, Chi-Chia Sun, M. H. Nguyen, W. Kuo","doi":"10.1109/ISPACS51563.2021.9650929","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9650929","url":null,"abstract":"One of the important tasks in the area of computer vision is semantic segmentation. The implementation of a semantic segmentation system in an embedded platform is a fruitful idea. But due to the limitations of embedded ability, it becomes a tough task. In this article, we proposed a novel and practical architecture i.e. quantized deep convolutional neural network for image segmentation (Q-SegNet). This architecture will be implemented on an FPGA device, which allows reducing the parameter size of the original architecture. Hence the required power also reduces. Thus, this paper proposed a high performance deep learning processor unit (DPU) based accelerator for Semantic segmentation neural network. This research is quite suitable for robot vision in an embedded platform and the segmentation accuracy is up to 89.60% on average. Notably, the proposed faster architecture is ideal for low power embedded devices that need to solve the shortest path problem, path searching, and motion planning, in the ADAS and Robot.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115239116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SNR Based Beam Switching in Millimeter Wave Vehicular Communications","authors":"Yuanfeng Peng, Miin-Jong Hao","doi":"10.1109/ISPACS51563.2021.9651021","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651021","url":null,"abstract":"The demands of high data rate and mobility in millimeter wave (mmWave) vehicular systems require precise beam alignments. Having the best aligned beams between transmitters and receivers guarantees the best link quality to maximum system performance. The performance highly depends on the signal-to-noise ratio (SNR) with a proper beam selection. In this paper, we propose a scheme which uses the signal-to-interference-noise ratio (SINR) as an indicator to evaluate the beam switching threshold. This scheme is a practical and implementable in the current realistic mobile wireless networks, and the simulation results show that the various misalignment ranges have no obvious degradation on the overall performance.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"920 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116185916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CNN-type myocardial infarction prediction based on cardiac cycle determination","authors":"Tsubasa Kanai, N. Tanabe, Y. Miyagi, J. Aoyama","doi":"10.1109/ISPACS51563.2021.9651000","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651000","url":null,"abstract":"This paper proposes the cardiac infarction prediction based on CNN-type myocardial infarction prediction using evaluate of myocardial dynamics. The proposed algorithm (i) remove noise using masking image for the ultrasound images into frame-by-frame, (ii) determine the cardiac cycle using optical flow analysis [1] for the characteristic dynamics of each cardiac anatomical region, and then, (iii) predict the myocardial infarction using CNN from time-frequency analysis of myocardial dynamics.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125592792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Underwater Object Recognition for Offshore Wind Farm Environmental Impact Assessment","authors":"Chun-Chih Lo, Yi-Ray Tseng, Chien-Chou Shih, Shurong Guo, Chin-Shiuh Shieh, Mong-Hong Horng","doi":"10.1109/ISPACS51563.2021.9651121","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651121","url":null,"abstract":"In recent years, Taiwan has actively built lots of offshore wind turbines in the western area of Taiwan due to its renewable energy policies. However, the construction of these turbines may potentially create a variety of issues for the marine ecosystems. Thus, it is necessary to evaluate each potential site for offshore wind turbines to decrease the impacts on the ecosystems. To achieve this, this paper proposes an underwater environmental monitoring architecture, using side-scan sonar imagery combining image noise filtering and YOLOv3 real-time object recognition technology to assist with the selection of the potential site of wind farms. The experimental results show this approach only needs 0.0021 seconds to process each sonar image with an average accuracy of 72.3% in the detection of fish schools.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127828440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Color Conversion Formulae between RGB Color Space and HSI Color Space for Color Image Processing","authors":"Taichi Oinosho, Minako Kameyama, A. Taguchi","doi":"10.1109/ISPACS51563.2021.9651118","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651118","url":null,"abstract":"The color space expressed by the three attributes of human color, hue, saturation, and lightness, is the HSI color space. In particular, in color image processing, preserving hue is required, therefore the HSI color space is used. Since color images are acquired and displayed in RGB color space, conversion with RGB color space is required for processing in HSI color space. The general HSI color space is defined by Gonzalez and Woods, but the color gamut is very different from the RGB color space. The larger the intensity value, the more inappropriate the saturation value. Thus, when the intensity and saturation are processed in this HSI color space, it is often the case that the processing result is located in outside the RGB color space or even if the processing result is in the RGB color space, it may not be an appropriate value. In this paper, RGB color space and CMY color space are used properly according to the value of the input signal, and processing is performed in the HSI color space converted from each space. After the processing in the HSI color space, we convert to the RGB color space by using a proposed method in this paper, in order to obtain the same processing result as the result of processing in the ideal HSI color space.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"101-B 8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125792172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Decoding Method Study of DSC Signal","authors":"Lie Yang, T. Su, Jui-Chuan Cheng, Yu-Chen Siao, Chien-Erh Weng","doi":"10.1109/ISPACS51563.2021.9651046","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651046","url":null,"abstract":"As the progress of information technology and intelligence gathering technology, marine communication services has already become an important platform to integrate ocean measure, safety of shipping and marine weather information. The Digital Selective Calling (DSC) under Global Marine Distress and Safety System (GMDSS), the existing software and equipment need to rely on foreign manufacturers. The maintenance cost is high, and the existing display interface of the software can only indicate the information of the vessels in distress by text message. In order to improve above-mentioned shortcoming, the aim of this paper is to integrate Automatic Identification System (AIS) and DSC to study the decoding method of DSC signal. The study can be divided into three phases. In the phase one, we study the registers of EV8850 microcontroller with PE0003 microcontroller to carry out the acquisition and decoding of the DSC signal. In the phase two, the format decoded DSC signals are classified as we need and uploads to the DSC database. In the final phase, finished AIS and DSC real-time information system to show the decoded results.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125955116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI-based Stereoview to Multiview Generation by Using Deformable Convolution","authors":"Wei Hong, J. Yang","doi":"10.1109/ISPACS51563.2021.9651052","DOIUrl":"https://doi.org/10.1109/ISPACS51563.2021.9651052","url":null,"abstract":"Three dimension (3D) movies are the main trend in the film industry. In the current stereoview, the audiences require wearing 3D glasses to perceive 3D visualization. The 3D movies with stereoview with only left and right views, which cannot be directly displayed in the naked-eyes 3D displays. To directly support naked-eyes 3D displays, which require multiple views, we propose a deep learning based stereo to multiview conversion system by using the deformable convolution to synthesize additional virtual views. For immersive 3D multimedia services, we hope we can improve the quality of user 3D experiences without wearing 3D glasses without the needs of depth estimation and depth image based rendering functions.","PeriodicalId":359822,"journal":{"name":"2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124152851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}