APSIPA Transactions on Signal and Information Processing: latest articles, page 10

Demystifying data and AI for manufacturing: case studies from a major computer maker

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2021-03-08 DOI: 10.1017/ATSIP.2021.3

Yi-Chun Chen, Bo-Huei He, Shih-Sung Lin, Jonathan Hans Soeseno, Daniel Stanley Tan, Trista Pei-chun Chen, Wei-Chao Chen

Abstract: In this article, we discuss the background and technical details of several smart manufacturing projects in a tier-one electronics manufacturing facility. We devise a process to manage logistics forecasting and inventory preparation for electronic parts using historical data and a recurrent neural network, achieving significant improvement over current methods. We present a system for automatically qualifying laptop software for mass production through computer vision and automation technology. The result is a reliable system that can save hundreds of man-years in the qualification process. Finally, we create a deep learning-based algorithm for visual inspection of product appearance, which requires significantly less defect training data than traditional approaches. For production needs, we design an automatic optical inspection machine suitable for our algorithm and process. We also discuss the issues of data collection and of enabling smart manufacturing projects in a factory setting, where the projects operate on a delicate balance between process innovations and cost-saving measures.

Citations: 1

Toward community answer selection by jointly static and dynamic user expertise modeling

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2021-03-01 DOI: 10.1017/ATSIP.2020.28

Yuchao Liu, Meng Liu, Jianhua Yin

Citations: 1

Subspace learning for facial expression recognition: an overview and a new perspective

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2021-01-14 DOI: 10.1017/ATSIP.2020.27

Cigdem Turan, Rui Zhao, K. Lam, Xiangjian He

Abstract: For image recognition, a large number of subspace-learning methods have been proposed to overcome the high-dimensionality problem of the features being used. In this paper, we first give an overview of the most popular and state-of-the-art subspace-learning methods, and then present a novel manifold-learning method, named soft locality preserving map (SLPM). SLPM aims to control the level of spread of the different classes, which is closely connected to the generalizability of the learned subspace. We also review the extension of manifold-learning methods to deep learning by formulating the loss functions for training, and further reformulate SLPM into a soft locality preserving (SLP) loss. These loss functions are applied as an additional regularization to the learning of deep neural networks. We evaluate these subspace-learning methods, as well as their deep-learning extensions, on facial expression recognition. Experiments on four commonly used databases show that SLPM effectively reduces the dimensionality of the feature vectors and enhances the discriminative power of the extracted features. Moreover, experimental results also demonstrate that the learned deep features regularized by SLP acquire better discriminability and generalizability for facial expression recognition.

Citations: 5

Fairness-Oriented User Scheduling for Bursty Downlink Transmission Using Multi-Agent Reinforcement Learning

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2020-12-30 DOI: 10.1561/116.00000028

Mingqi Yuan, Qi Cao, Man-On Pun, Yi Chen

Abstract: In this work, we develop practical user scheduling algorithms for downlink bursty traffic with emphasis on user fairness. In contrast to conventional scheduling algorithms that either divide the transmission time slots equally among users or maximize ratios without physical meaning, we propose to use the 5%-tile user data rate (5TUDR) as the metric to evaluate user fairness. Since it is difficult to directly optimize 5TUDR, we first cast the problem into the stochastic game framework and subsequently propose a Multi-Agent Reinforcement Learning (MARL)-based algorithm to perform distributed optimization on the resource block group (RBG) allocation. Furthermore, each MARL agent is designed to take information measured by network counters from multiple network layers (e.g. Channel Quality Indicator, buffer size) as the input states and the RBG allocation as the action, with a proposed reward function designed to maximize 5TUDR. Extensive simulation is performed to show that the proposed MARL-based scheduler can achieve fair scheduling while maintaining good average network throughput as compared to conventional schedulers.
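The 5%-tile user data rate named in this abstract can be illustrated with a short sketch. This is not the authors' implementation: the helper names `percentile` and `five_percentile_rate`, the sample rates, and the linear-interpolation convention for percentiles are all assumptions made for illustration, and the abstract does not say which percentile convention the paper uses.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Hypothetical helper; the interpolation convention is an assumption.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Fractional position of the requested percentile in the sorted list.
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def five_percentile_rate(user_rates):
    """Data rate below which the worst 5% of users fall (illustrative)."""
    return percentile(user_rates, 5)


# Made-up per-user data rates in Mbit/s: 1.0, 2.0, ..., 21.0.
rates = [float(i) for i in range(1, 22)]
print(five_percentile_rate(rates))  # prints 2.0
```

A scheduler that raises this value improves service for the worst-served users, which is why such a statistic can act as a fairness measure alongside average throughput.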

Citations: 2

A multi-branch ResNet with discriminative features for detection of replay speech signals

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2020-12-29 DOI: 10.1017/ATSIP.2020.26

Xingliang Cheng, Mingxing Xu, T. Zheng

Abstract: The security of ASV systems is gaining increasing attention. As one of the common spoofing methods, replay attacks are easy to implement but difficult to detect. Many researchers focus on designing features to detect the distortion introduced by replay attack attempts. Constant-Q cepstral coefficients (CQCC), based on the magnitude of the constant-Q transform (CQT), are one of the notable features in the field of replay detection. However, CQCC ignores phase information, which may also be distorted in the replay process. In this work, we propose a CQT-based modified group delay feature (CQTMGD) which can capture the phase information of CQT. Furthermore, a multi-branch residual convolution network, ResNeWt, is proposed to distinguish replay attacks from bona fide attempts. We evaluated our proposal on the ASVspoof 2019 physical access dataset. Results show that CQTMGD outperformed the traditional MGD feature, and the fusion with other magnitude-based and phase-based features achieved a further improvement. Our best fusion system achieved 0.0096 min-tDCF and 0.39% EER on the evaluation set, and it outperformed all the other state-of-the-art methods in the ASVspoof 2019 physical access challenge.

Citations: 3

An evaluation of voice conversion with neural network spectral mapping models and WaveNet vocoder

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2020-11-25 DOI: 10.1017/ATSIP.2020.24

Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, T. Toda

Abstract: This paper presents an evaluation of parallel voice conversion (VC) with neural network (NN)-based statistical models for spectral mapping and waveform generation. The NN-based architectures for spectral mapping include deep NN (DNN), deep mixture density network (DMDN), and recurrent NN (RNN) models. A WaveNet (WN) vocoder is employed for high-quality NN-based waveform generation. In VC, though, quality degradation still occurs owing to the oversmoothed characteristics of the estimated speech parameters. To address this problem, we utilize post-conversion for the converted features based on direct waveform modifferential and global variance postfilter. To preserve consistency with the post-conversion, we further propose a spectrum differential loss for the spectral modeling. The experimental results demonstrate that: (1) the RNN-based spectral modeling achieves higher accuracy with a faster convergence rate and better generalization compared to the DNN-/DMDN-based models; (2) the RNN-based spectral modeling is also capable of producing a less oversmoothed spectral trajectory; (3) the use of the proposed spectrum differential loss improves the performance in same-gender conversions; and (4) the proposed post-conversion on converted features for the WN vocoder in VC yields the best performance in both naturalness and speaker similarity compared to the conventional use of the WN vocoder.

Citations: 1

End-to-end recognition of streaming Japanese speech using CTC and local attention

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2020-11-23 DOI: 10.1017/ATSIP.2020.23

Jiahao Chen, Ryota Nishimura, N. Kitaoka

Abstract: Many end-to-end, large vocabulary, continuous speech recognition systems are now able to achieve better speech recognition performance than conventional systems. Most of these approaches, however, are based on bidirectional networks and sequence-to-sequence modeling, so automatic speech recognition (ASR) systems using such techniques need to wait for an entire segment of voice input to be entered before they can begin processing the data. This results in a lengthy time lag, which can be a serious drawback in some applications. An obvious solution to this problem is to develop a speech recognition algorithm capable of processing streaming data. Therefore, in this paper we explore the possibility of a streaming, online ASR system for Japanese using a model based on unidirectional LSTMs trained using connectionist temporal classification (CTC) criteria, with local attention. Such an approach has not been well investigated for use with Japanese, as most Japanese-language ASR systems employ bidirectional networks. The best result for our proposed system during experimental evaluation was a character error rate of 9.87%.
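The character error rate reported here is commonly computed as the character-level edit distance between the recognized text and the reference transcript, divided by the reference length. The sketch below shows that common definition only; the function names and example strings are hypothetical, and the abstract does not state the exact scoring procedure the authors used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (illustrative helper)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of five reference characters.
print(character_error_rate("こんにちは", "こんにちわ"))  # prints 0.2
```

Scoring at the character level avoids the need for word segmentation, which is one reason this metric is often preferred for Japanese.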

Citations: 2

Ground-distance segmentation of 3D LiDAR point cloud toward autonomous driving

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2020-11-23 DOI: 10.1017/ATSIP.2020.21

Jian Wu, Qingxiong Yang

Citations: 2

An SMLB-based OFDM receiver over impulsive noise environment

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2020-11-20 DOI: 10.1017/ATSIP.2020.22

Chengbo Liu, Na Chen, M. Okada, Yafei Hou

Citations: 0

Discreteness and group sparsity aware detection for uplink overloaded MU-MIMO systems

IF 3.2

APSIPA Transactions on Signal and Information Processing Pub Date : 2020-10-06 DOI: 10.1017/ATSIP.2020.19

Ryo Hayakawa, Ayano Nakai-Kasai, K. Hayashi

Abstract: This paper proposes signal detection methods for frequency domain equalization (FDE) based overloaded multiuser multiple input multiple output (MU-MIMO) systems for uplink Internet of things (IoT) environments, where many IoT terminals are served by a base station with fewer antennas than there are IoT terminals. Using the fact that the transmitted signal vector is discrete and group sparse, we propose a convex discreteness and group sparsity aware (DGS) optimization problem for the signal detection. We provide an optimization algorithm for the DGS optimization on the basis of the alternating direction method of multipliers (ADMM). Moreover, we extend the DGS optimization into weighted DGS (W-DGS) optimization and propose an iterative approach named iterative weighted DGS (IW-DGS), where we iteratively solve the W-DGS optimization problem while updating the parameters in the objective function. We also discuss the computational complexity of the proposed IW-DGS and show that the order of the complexity can be reduced by using the structure of the channel matrix. Simulation results show that the symbol error rate (SER) performance of the proposed method is close to that of the oracle zero forcing (ZF) method, which perfectly knows the activity of each IoT terminal.

Citations: 2