2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC): Latest Articles

Time-Frequency Mask-based Speech Enhancement using Convolutional Generative Adversarial Network
Neil Shah, H. Patil, Meet H. Soni
{"title":"Time-Frequency Mask-based Speech Enhancement using Convolutional Generative Adversarial Network","authors":"Neil Shah, H. Patil, Meet H. Soni","doi":"10.23919/APSIPA.2018.8659692","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659692","url":null,"abstract":"A Speech Enhancement (SE) system deals with improving the perceptual quality and preserving the speech intelligibility of a noisy mixture. Time-Frequency (T-F) masking-based SE using a supervised learning algorithm, such as a Deep Neural Network (DNN), has outperformed traditional SE techniques. However, the notable difference observed between the oracle mask and the predicted mask motivates us to explore different deep learning architectures. In this paper, we propose to use a Convolutional Neural Network (CNN)-based Generative Adversarial Network (GAN) for inherent mask estimation. A GAN takes advantage of adversarial optimization, an alternative to other Maximum Likelihood (ML) optimization-based architectures. We also show the need for supervised T-F mask estimation for effective noise suppression. Experimental results demonstrate that the proposed T-F mask-based SE significantly outperforms the recently proposed end-to-end SEGAN and a GAN-based Pix2Pix architecture. The performance evaluation, in terms of both the predicted mask and the objective measures, indicates an improvement in speech quality and a simultaneous reduction in the speech distortion observed in the noisy mixture.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130724429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
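The abstract above contrasts an oracle T-F mask with a predicted one. As a rough illustration only (not the authors' CNN-GAN model), the sketch below computes one common oracle mask, the ideal ratio mask, from clean-speech and noise magnitude spectrograms and applies it to the noisy mixture. The function names, the random toy spectrograms, and the choice of the ideal ratio mask are assumptions made for this example; the paper may define its mask differently.

```python
import numpy as np

def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-12):
    """Oracle mask per T-F bin: speech energy / (speech energy + noise energy)."""
    speech_pow = speech_mag ** 2
    noise_pow = noise_mag ** 2
    return speech_pow / (speech_pow + noise_pow + eps)

def apply_mask(mixture_mag, mask):
    """Element-wise masking of the noisy magnitude spectrogram."""
    return mask * mixture_mag

# Hypothetical toy magnitude spectrograms (frequency bins x frames).
rng = np.random.default_rng(0)
speech_mag = rng.random((257, 100))
noise_mag = rng.random((257, 100))
mixture_mag = speech_mag + noise_mag  # crude additive approximation of the mixture

mask = ideal_ratio_mask(speech_mag, noise_mag)
enhanced_mag = apply_mask(mixture_mag, mask)
```

Because every mask value lies between 0 and 1, masking can only attenuate a bin, never amplify it; a learned estimator would replace `ideal_ratio_mask` with a network that sees only the mixture.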
Block Tensor Train Decomposition for Missing Value Imputation
Namgil Lee
{"title":"Block Tensor Train Decomposition for Missing Value Imputation","authors":"Namgil Lee","doi":"10.23919/APSIPA.2018.8659560","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659560","url":null,"abstract":"We propose a new method for imputation of missing values in large scale matrix data based on a low-rank tensor approximation technique called the block tensor train (TT) decomposition. Given sparsely observed data points, the proposed method iteratively computes the soft-thresholded singular value decomposition (SVD) of the underlying data matrix with missing values. The SVD of matrices is performed based on a low-rank block TT decomposition for large scale data matrices with a low-rank tensor structure. Experimental results on simulated data demonstrate that the proposed method can estimate a large amount of missing values accurately compared to a matrix-based standard method.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133519781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
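The abstract above describes iteratively computing a soft-thresholded SVD of a data matrix with missing values. The sketch below shows that iteration on a small dense matrix using a plain NumPy SVD; it does not include the block tensor train decomposition that the paper contributes. The function names, the threshold value, the iteration count, and the synthetic rank-3 test matrix are all assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold_svd(matrix, threshold):
    """SVD with each singular value shrunk toward zero by `threshold`."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - threshold, 0.0)
    return (u * s_shrunk) @ vt

def soft_impute(observed, mask, threshold=2.0, n_iters=200):
    """Fill missing entries (mask == False) by repeated soft-thresholded SVD."""
    estimate = np.zeros_like(observed, dtype=float)
    for _ in range(n_iters):
        # Keep observed entries, use the current estimate for the missing ones.
        filled = np.where(mask, observed, estimate)
        estimate = soft_threshold_svd(filled, threshold)
    return estimate

# Hypothetical test data: a rank-3 matrix with about 40% of entries hidden.
rng = np.random.default_rng(0)
truth = rng.standard_normal((40, 3)) @ rng.standard_normal((3, 30))
mask = rng.random(truth.shape) < 0.6
observed = np.where(mask, truth, 0.0)

estimate = soft_impute(observed, mask)
missing = ~mask
rmse_imputed = np.sqrt(np.mean((estimate - truth)[missing] ** 2))
rmse_zero_fill = np.sqrt(np.mean(truth[missing] ** 2))
```

Comparing `rmse_imputed` with `rmse_zero_fill` gives a quick check that the low-rank iteration recovers hidden entries better than leaving them at zero. A full dense SVD per iteration is what becomes expensive for large matrices.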
On the Comparative Effect of Snowfall, Accumulation, and Density on Speech Intelligibility
Shuto Shibata, K. Kondo
{"title":"On the Comparative Effect of Snowfall, Accumulation, and Density on Speech Intelligibility","authors":"Shuto Shibata, K. Kondo","doi":"10.23919/APSIPA.2018.8659782","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659782","url":null,"abstract":"Sound is known to be altered in some manner by the acoustic characteristics of snow. However, the specific characteristics of snow that actually affect the acoustical transfer characteristics are not clearly understood. These transfer characteristics will be crucial in disaster prevention radio broadcasting systems that warn citizens working outdoors of potential natural disasters during the winter in regions with heavy snow. These systems use extremely high-output horn speakers to convey the warning messages to a large area. Accordingly, the purpose of this research is to clarify how speech intelligibility is influenced by the amount of snowfall, its accumulation, and the snow density. In this research, impulse response measurements were carried out outdoors during actual snowfall. We measured and compiled the transfer characteristics under several snow conditions and convolved these with test speech in order to simulate the transmitted speech quality during snow. We conducted a Japanese speech intelligibility test using these speech samples and clarified the effect of each snow quality measure using multivariate analysis. As a result, it was found that although there is some influence of the amount of snowfall and density, the influence of the amount of snowfall becomes dominant as the distance between the loudspeaker and the listener (microphone) becomes large.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133549242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-View and Multi-Modal Action Recognition with Learned Fusion
Sandy Ardianto, H. Hang
{"title":"Multi-View and Multi-Modal Action Recognition with Learned Fusion","authors":"Sandy Ardianto, H. Hang","doi":"10.23919/APSIPA.2018.8659539","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659539","url":null,"abstract":"In this paper, we study a multi-modal and multi-view action recognition system based on deep-learning techniques. We extended the Temporal Segment Network with an additional data fusion stage to combine information from different sources. In this research, we use multiple types of information from different modalities, such as RGB, depth, and infrared data, to detect predefined human actions. We tested various combinations of these data sources to examine their impact on the final detection accuracy. We designed 3 information fusion methods to generate the final decision. The most interesting one is the Learned Fusion Net designed by us. It turns out that the Learned Fusion structure has the best results but requires more training.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133302894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Cocoa bean quality assessment using closed range hyperspectral images
Oswaldo Bayona, Daniel Ochoa, Ronald Criollo, J. Cevallos-Cevallos, Wenzi Liao
{"title":"Cocoa bean quality assessment using closed range hyperspectral images","authors":"Oswaldo Bayona, Daniel Ochoa, Ronald Criollo, J. Cevallos-Cevallos, Wenzi Liao","doi":"10.23919/APSIPA.2018.8659490","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659490","url":null,"abstract":"Farmers mix high and low quality cocoa beans to increase their income at the expense of chocolate flavor. We use closed range hyperspectral images to recognize two common varieties of cocoa beans at various fermentation stages. Several image calibration issues are addressed in this paper to reduce the effect of the bean's shape in the reflectance image estimation and specular patches on the bean's surface. Fusion and feature extraction techniques were exploited for bean classification. From our experimental results, we noticed that the biochemical processes of each bean type during fermentation influence their spectral signatures, enabling increasingly better discrimination. We found that spectral indexes related to the anthocyanin reflectance index yield a high discriminant rate, particularly at later fermentation stages. These findings suggest that bean classification is possible and could be adopted as the standard method for fast bean quality assessment.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132088500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Block-Permutation-Based Image Encryption Allowing Hierarchical Decryption
Yusuke Izawa, Shoko Imaizumi, H. Kiya
{"title":"A Block-Permutation-Based Image Encryption Allowing Hierarchical Decryption","authors":"Yusuke Izawa, Shoko Imaizumi, H. Kiya","doi":"10.23919/APSIPA.2018.8659479","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659479","url":null,"abstract":"This paper proposes a block-permutation-based encryption (BPBE) scheme, which allows only decrypting particular regions in the encrypted image. It is difficult to perform partial decryption in the conventional scheme, because it encrypts the entire image at once. By composing regions in the original image, we can conduct the hierarchical encryption and achieve the partial decryption in the proposed scheme. Additionally, the proposed scheme can maintain the JPEG-LS compression efficiency of the encrypted images compared to the conventional scheme. Moreover, the resilience against jigsaw puzzle solving problems can be enhanced by applying the proposed scheme to the combined images. We further consider an efficient key management by using hash chains.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127869386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
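The abstract above is built around permuting image blocks under a key. The sketch below illustrates only that block-scrambling idea on a grayscale array: blocks are shuffled with a key-seeded permutation and restored by inverting it. It does not cover the hierarchical region composition, the JPEG-LS considerations, or the hash-chain key management mentioned in the abstract. All function names are hypothetical, the image is assumed to have dimensions divisible by the block size, and NumPy's seeded generator stands in for a real key-derived permutation (it is not cryptographically secure).

```python
import numpy as np

def split_blocks(image, block):
    """Cut an (H, W) image into a stack of (block, block) tiles, row-major."""
    h, w = image.shape
    rows, cols = h // block, w // block
    return (image.reshape(rows, block, cols, block)
                 .swapaxes(1, 2)
                 .reshape(rows * cols, block, block))

def merge_blocks(blocks, shape, block):
    """Inverse of split_blocks: reassemble tiles into an (H, W) image."""
    h, w = shape
    rows, cols = h // block, w // block
    return (blocks.reshape(rows, cols, block, block)
                  .swapaxes(1, 2)
                  .reshape(h, w))

def encrypt(image, key, block=8):
    """Scramble block positions with a permutation derived from `key`."""
    blocks = split_blocks(image, block)
    perm = np.random.default_rng(key).permutation(len(blocks))
    return merge_blocks(blocks[perm], image.shape, block)

def decrypt(cipher, key, block=8):
    """Regenerate the same permutation from `key` and undo it."""
    blocks = split_blocks(cipher, block)
    perm = np.random.default_rng(key).permutation(len(blocks))
    restored = np.empty_like(blocks)
    restored[perm] = blocks  # cipher block i came from original block perm[i]
    return merge_blocks(restored, cipher.shape, block)

# Hypothetical 32x32 grayscale test image.
image = np.random.default_rng(1).integers(0, 256, size=(32, 32), dtype=np.uint8)
cipher = encrypt(image, key=1234)
recovered = decrypt(cipher, key=1234)
```

Since pixels inside each block are untouched, block-level statistics survive the scrambling, which is the property that lets such schemes stay friendly to block-aware compression.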
A Speech Processing Strategy based on Sinusoidal Speech Model for Cochlear Implant Users
Sungmin Lee, Sara Akbarzadeh, Satnam Singh, Chin-Tuan Tan
{"title":"A Speech Processing Strategy based on Sinusoidal Speech Model for Cochlear Implant Users","authors":"Sungmin Lee, Sara Akbarzadeh, Satnam Singh, Chin-Tuan Tan","doi":"10.23919/APSIPA.2018.8659620","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659620","url":null,"abstract":"In sinusoidal modeling (SM), a speech signal, which is pseudo-periodic in structure, can be approximated by sinusoids and noise without losing significant speech information. A speech processing strategy based on this sinusoidal speech model will be relevant for encoding electric pulse streams in cochlear implant (CI) processing, where the number of channels available is limited. In this study, 5 normal hearing (NH) listeners and 2 CI users were asked to perform speech recognition and perceived sound quality rating tasks on speech sentences processed in 12 different test conditions. The sinusoidal analysis/synthesis algorithm was limited to 1, 3 or 6 sinusoids from the sentences low-pass filtered at either 1 kHz, 1.5 kHz, 3 kHz, or 6 kHz, re-synthesized as the test conditions. Each of 12 lists of AzBio sentences was randomly chosen and processed with one of the 12 test conditions before being presented to each participant at 65 dB SPL (Sound Pressure Level). Participants were instructed to repeat each sentence as they perceived it, and the number of words correctly recognized was scored. They were also asked to rate the perceived sound quality of the sentences, including the original speech sentence, on a scale of 1 (distorted) to 10 (clean). Both the speech recognition score and the perceived sound quality rating across all participants increase as the number of sinusoids increases and the low-pass filter broadens. Our current finding showed that three sinusoids may be sufficient to elicit the nearly maximum speech intelligibility and quality necessary for both NH and CI listeners. The sinusoidal speech model has the potential to form the basis for a speech processing strategy in CI.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127386133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Weakly Labeled Learning Using BLSTM-CTC for Sound Event Detection
Taiki Matsuyoshi, Tatsuya Komatsu, Reishi Kondo, Takeshi Yamada, S. Makino
{"title":"Weakly Labeled Learning Using BLSTM-CTC for Sound Event Detection","authors":"Taiki Matsuyoshi, Tatsuya Komatsu, Reishi Kondo, Takeshi Yamada, S. Makino","doi":"10.23919/APSIPA.2018.8659528","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659528","url":null,"abstract":"In this paper, we propose a method of weakly labeled learning of bidirectional long short-term memory (BLSTM) using connectionist temporal classification (BLSTM-CTC) to reduce the hand-labeling cost of learning samples. BLSTM-CTC enables us to update the parameters of BLSTM by loss calculation using CTC, instead of the exact error calculation that cannot be conducted when using weakly labeled samples, which have only the event class of each individual sound event. In the proposed method, we first conduct strongly labeled learning of BLSTM using a small amount of strongly labeled samples, which have the timestamps of the beginning and end of each individual sound event and its event class, as initial learning. We then conduct weakly labeled learning based on BLSTM-CTC using a large amount of weakly labeled samples as additional learning. To evaluate the performance of the proposed method, we conducted a sound event detection experiment using the dataset provided by Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 2. As a result, the proposed method improved the segment-based F1 score by 1.9% compared with the initial learning mentioned above. Furthermore, it succeeded in reducing the labeling cost by 95%, although the F1 score was degraded by 1.3%, compared with additional learning using a large amount of strongly labeled samples. This result confirms that our weakly labeled learning is effective for learning BLSTM with a low hand-labeling cost.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133795308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
A Study on Indoor Dimming Method Utilizing Outside Light for Power Saving
Kengo Sasaki, E. Okamoto
{"title":"A Study on Indoor Dimming Method Utilizing Outside Light for Power Saving","authors":"Kengo Sasaki, E. Okamoto","doi":"10.23919/APSIPA.2018.8659602","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659602","url":null,"abstract":"In next-generation power networks, more energy-saving and energy-efficient networks are required. One of the solutions is a location-aware energy distribution scheme, where each person's location is accurately estimated by a centimeter-order indoor localization scheme and the energy is preferentially allocated to the electric equipment near them. As one of its applications, there is an energy-saving indoor lighting control scheme that exploits the person's location information and the estimated illumination intensity, and large energy saving effects are obtained. We have proposed an indoor dimming scheme that considers external light in previous studies. However, in the previous study, advanced intensity measurements at many reference points were required. Therefore, in this paper, we propose an energy-saving indoor lighting control method that uses an estimated external light to reduce the measurement points. Numerical results show the advanced performance of the proposed method.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"323 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124295125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Diversification Strategy for IIR Filter Design Using PSO
Y. Takase, K. Suyama
{"title":"A Diversification Strategy for IIR Filter Design Using PSO","authors":"Y. Takase, K. Suyama","doi":"10.23919/APSIPA.2018.8659771","DOIUrl":"https://doi.org/10.23919/APSIPA.2018.8659771","url":null,"abstract":"The IIR (Infinite Impulse Response) filter design problem is a non-linear optimization problem. Because PSO (Particle Swarm Optimization) can enumerate solution candidates quickly, it is known as an effective method for such a problem. However, PSO has a drawback: it tends toward premature convergence due to its strong directivity. In this paper, PSS (Problem Space Stretch)-PSO is verified to avoid the local minimum stagnation. Several design examples are shown to present the effectiveness of the method.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116006038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
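For readers unfamiliar with the optimizer named in the abstract above, the sketch below implements a plain global-best PSO and runs it on a toy sphere function. It is not the PSS-PSO variant the paper evaluates, and it does not design an IIR filter; the function names, the inertia and acceleration coefficients, the search bounds, and the objective are all assumptions chosen for illustration.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=30, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5,
                 bounds=(-5.0, 5.0), seed=0):
    """Basic global-best particle swarm optimization over a box-bounded space."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    pos = rng.uniform(low, high, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                                   # per-particle best positions
    best_val = np.array([objective(p) for p in pos])        # per-particle best values
    g_idx = np.argmin(best_val)
    g_pos, g_val = best_pos[g_idx].copy(), best_val[g_idx]  # swarm-wide best
    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity mixes momentum, pull toward own best, and pull toward swarm best.
        vel = (inertia * vel
               + c_personal * r1 * (best_pos - pos)
               + c_global * r2 * (g_pos - pos))
        pos = np.clip(pos + vel, low, high)
        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g_idx = np.argmin(best_val)
        if best_val[g_idx] < g_val:
            g_pos, g_val = best_pos[g_idx].copy(), best_val[g_idx]
    return g_pos, g_val

def sphere(x):
    """Toy convex objective with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

best_x, best_f = pso_minimize(sphere, dim=2)
```

The pull of every particle toward the single swarm-wide best is the "strong directivity" the abstract mentions: on a multimodal objective it can draw the whole swarm into one local minimum, which is the behavior diversification strategies aim to counter.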