{"title":"An Application-Oriented Taxonomy on Spoofing, Disguise and Countermeasures in Speaker Recognition","authors":"Lantian Li, Xingliang Cheng, T. Zheng","doi":"10.1561/116.00000017","DOIUrl":"https://doi.org/10.1561/116.00000017","url":null,"abstract":"","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67080002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Automatic Areas of Interest (AOI) Bounding Boxes Estimation Algorithm for Dynamic Eye-Tracking Stimuli","authors":"E. A. Lagmay, M. M. Rodrigo","doi":"10.1561/116.00000026","DOIUrl":"https://doi.org/10.1561/116.00000026","url":null,"abstract":"In eye-tracking research, an area of interest (AOI) is defined as any object in the visual stimuli that is focused on by the viewer, and is delimited by bounding boxes of any shape. If a study makes use of a small number of static visual stimuli, then researchers may define AOIs manually. However, if the stimuli are dynamic, then manual AOI definition is neither efficient nor scalable. This paper presents the Enhanced Automatic AOI Bounding Boxes Estimation Algorithm, which automatically estimates the AOI bounding boxes in dynamic stimuli using simple image segmentation techniques. This algorithm is an improvement on the Automatic AOI Bounding Boxes Estimation Algorithm. It uses a faster version of the SLIC algorithm that utilizes the AVX2 SIMD (Single Instruction, Multiple Data) parallelization paradigm, and replaces the second K-Means Image Segmentation procedure at the end of the previous version of the algorithm with Region Adjacency Graph (RAG) Thresholding. The evaluation of the overall results of the new algorithm shows that the Enhanced Automatic AOI Bounding Boxes Estimation Algorithm is superior to its predecessor both in terms of accuracy (recall and precision) and efficiency (benchmarking).","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67081045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Chord and Key Estimation Based on a Hierarchical Variational Autoencoder with Multi-task Learning","authors":"Yiming Wu, Kazuyoshi Yoshii","doi":"10.1561/116.00000052","DOIUrl":"https://doi.org/10.1561/116.00000052","url":null,"abstract":"This paper describes a deep generative approach to joint chord and key estimation for music signals. The limited amount of music signals with complete annotations has been the major bottleneck in supervised multi-task learning of a classification model. To overcome this limitation, we integrate the supervised multi-task learning approach with the unsupervised autoencoding approach in a mutually complementary manner. Considering the typical process of music composition, we formulate a hierarchical latent variable model that sequentially generates keys, chords, and chroma vectors. The keys and chords are assumed to follow a language model that represents their relationships and dynamics. In the framework of amortized variational inference (AVI), we introduce a classification model that jointly infers discrete chord and key labels and a recognition model that infers continuous latent features. These models are combined to form a variational autoencoder (VAE) and are trained jointly in a (semi-)supervised manner, where the generative and language models act as regularizers for the classification model. We comprehensively investigate three different architectures for the chord and key classification model, and three different architectures for the language model. Experimental results demonstrate that the VAE-based multi-task learning improves chord estimation as well as key estimation.","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67081066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial for the Special Issue on Information Processing for Understanding Human Attentional and Affective States","authors":"J. Yoshimoto, U. Obaidellah","doi":"10.1561/116.00000100","DOIUrl":"https://doi.org/10.1561/116.00000100","url":null,"abstract":"","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67081312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial for the Special Issue on Multi-Disciplinary Dis/Misinformation Analysis and Countermeasures","authors":"Yuhong Liu","doi":"10.1561/116.00000101","DOIUrl":"https://doi.org/10.1561/116.00000101","url":null,"abstract":"","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67081362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Content-Adaptive Level of Detail for Lossless Point Cloud Compression","authors":"Lei Wei, Shuai Wan, Fuzheng Yang, Zhecheng Wang","doi":"10.1561/116.00000004","DOIUrl":"https://doi.org/10.1561/116.00000004","url":null,"abstract":"The nonuniform distribution of points in a point cloud and their abundant attribute information (such as colour, reflectance, and normal) result in the generation of massive data, making point cloud compression (PCC) essential for related applications. The hierarchical structure of the level of detail (LOD) in a point cloud and the corresponding predictions are commonly used in PCC, whereas the current method of LOD generation is neither content adaptive nor optimized. Targeting lossless PCC, an LOD prediction error model is proposed in this work, based on which the prediction error is minimized to obtain the optimal coding performance. As a result, the process of generating LOD is optimized, where the smallest number of LOD levels that yields the minimum attribute bitrate can be found. The proposed method is evaluated on various standard datasets under common test conditions. Experimental results show that the proposed method achieves optimal coding performance in a content-adaptive way while significantly reducing the time required for encoding and decoding, i.e., ∼15.2% and ∼17.3% time savings on average for distance-based LOD, and ∼5.4% and ∼5.1% time savings for Morton-based LOD, respectively.","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67079920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recurrent Neural Networks and Their Memory Behavior: A Survey","authors":"Yuanhang Su, C.-C. Jay Kuo","doi":"10.1561/116.00000123","DOIUrl":"https://doi.org/10.1561/116.00000123","url":null,"abstract":"","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67081438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ITS-Net: Iterative Two-Stream Network for Image Super-Resolution","authors":"Wei Li, Yan Huang, Yilong Yin, Jingliang Peng","doi":"10.1561/116.00000018","DOIUrl":"https://doi.org/10.1561/116.00000018","url":null,"abstract":"","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67080013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial for the Special Issue on Deep Neural Networks","authors":"Li-Wei Kang, C. Yeh","doi":"10.1561/116.00000062","DOIUrl":"https://doi.org/10.1561/116.00000062","url":null,"abstract":"","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67081126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning for Human Action Recognition: A Comprehensive Review","authors":"Duc-Quang Vu, Trang Phung Thi Thu, Ngan T. H. Le, Jia-Ching Wang","doi":"10.1561/116.00000068","DOIUrl":"https://doi.org/10.1561/116.00000068","url":null,"abstract":"","PeriodicalId":44812,"journal":{"name":"APSIPA Transactions on Signal and Information Processing","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67081172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}