APSIPA Transactions on Signal and Information Processing: Latest Articles

End-to-end Japanese Multi-dialect Speech Recognition and Dialect Identification with Multi-task Learning
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000045
Ryo Imaizumi, Ryo Masumura, Sayaka Shiota, H. Kiya
Abstract: End-to-end systems have demonstrated state-of-the-art performance on many tasks related to automatic speech recognition (ASR) and dialect identification (DID). In this paper, we propose multi-task learning of Japanese DID and multi-dialect ASR (MD-ASR) systems with end-to-end models. Since Japanese dialects have variety in both linguistic and acoustic aspects of each dialect, Japanese DID requires simultaneously considering linguistic and acoustic features. One solution realizing Japanese DID using these features is to use transcriptions from ASR when performing DID. However, transcribing Japanese multi-dialect speech into text is regarded as a challenging task in ASR because there are big gaps in linguistic and acoustic features between a dialect and standard Japanese. One solution is dialect-aware ASR modeling, which means DID is performed with ASR. Therefore, the multi-task learning framework of Japanese DID and ASR is proposed to represent the dependency of them. We explore three systems as part of the proposed framework, changing the order in which DID and ASR are performed. In the experiments, Japanese multi-dialect ASR and DID tests were conducted on our home-made Japanese multi-dialect database and a standard Japanese database. The proposed transformer-based systems outperformed the conventional single task systems on both DID and ASR tests.
Citations: 8
Identifying Code Reading Strategies in Debugging using STA with a Tolerance Algorithm
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000040
Christine Lourrine S. Tablatin, M. M. Rodrigo
Citations: 1
DeepFake and its Enabling Techniques: A Review
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000024
R. Brooks, Yefeng Yuan, Yuhong Liu, Haiquan Chen
Citations: 2
American Sign Language Fingerspelling Recognition in the Wild with Iterative Language Model Construction
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000003
W. Kumwilaisak, Peerawat Pannattee, C. Hansakunbuntheung, N. Thatphithakkul
Abstract: This paper proposes a novel method to improve the accuracy of the American Sign Language fingerspelling recognition. Video sequences from the training set of the “ChicagoFSWild” dataset are first utilized for training a deep neural network of weakly supervised learning to generate frame labels from a sequence label automatically. The network of weakly supervised learning contains the AlexNet and the LSTM. This trained network generates a collection of frame-labeled images from the training video sequences that have Levenshtein distance between the predicted sequence and the sequence label equal to zero. The negative and positive pairs of all fingerspelling gestures are randomly formed from the collected image set. These pairs are adopted to train the Siamese network of the ResNet-50 and the projection function to produce efficient feature representations. The trained ResNet-50 and the projection function are concatenated with the bidirectional LSTM, a fully connected layer, and a softmax layer to form a deep neural network for the American Sign Language fingerspelling recognition. With the training video sequences, video frames corresponding to the video sequences that have Levenshtein distance between the predicted sequence and the sequence label equal to zero are added to the collected image set. The updated collected image set is used to train the Siamese network. The training process, from training the Siamese network to the update of the collected image set, is iterated until the image recognition performance is not further enhanced. The experimental results from the “ChicagoFSWild” dataset show that the proposed method surpasses the existing works in terms of the character error rate.
Citations: 1
Self-Supervised Motion-Corrected Image Reconstruction Network for 4D Magnetic Resonance Imaging of the Body Trunk
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000039
T. Küstner, Jiazhen Pan, Christopher Gilliam, H. Qi, G. Cruz, K. Hammernik, T. Blu, D. Rueckert, René M. Botnar, C. Prieto, S. Gatidis
Abstract: Respiratory motion can cause artifacts in magnetic resonance imaging of the body trunk if patients cannot hold their breath or triggered acquisitions are not practical. Retrospective correction strategies usually cope with motion by fast imaging sequences under free-movement conditions followed by motion binning based on motion traces. These acquisitions yield sub-Nyquist sampled and motion-resolved k-space data. Motion states are linked to each other by non-rigid deformation fields. Usually, motion registration is formulated in image space which can however be impaired by aliasing artifacts or by estimation from low-resolution images. Subsequently, any motion-corrected reconstruction can be biased by errors in the deformation fields. In this work, we propose a deep-learning based motion-corrected 4D (3D spatial + time) image reconstruction which combines a non-rigid registration network and a 4D reconstruction network. Non-rigid motion is estimated in k-space and incorporated into the reconstruction network. The proposed method is evaluated on in-vivo 4D motion-resolved magnetic resonance images of patients with suspected liver or lung metastases and healthy subjects. The proposed approach provides 4D motion-corrected images and deformation fields. It enables a ~14× accelerated acquisition with a 25-fold faster reconstruction than comparable approaches under consistent preservation of image quality for changing patients and motion patterns.
Citations: 4
The Future of Video Coding
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000044
N. Ling, C.-C. Jay Kuo, G. Sullivan, Dong Xu, Shan Liu, H. Hang, Wen-Hsiao Peng, Jiaying Liu
Citations: 5
Combating Misinformation/Disinformation in Online Social Media: A Multidisciplinary View
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.0000127
M. Barni, Y. Fang, Yuhong Liu, Laura Robinson, K. Sasahara, Subramaniam Vincent, Xinchao Wang, Zhizheng Wu
Abstract: Recently, the viral propagation of mis/disinformation has raised significant concerns from both academia and industry. This problem is particularly difficult because on the one hand, rapidly evolving technology makes it much cheaper and easier to manipulate and propagate social media information. On the other hand, the complexity of human psychology and sociology makes the understanding, prediction and prevention of users' involvement in mis/disinformation propagation very difficult. This themed series on "Multi-Disciplinary Dis/Misinformation Analysis and Countermeasures" aims to bring the attention and efforts from researchers in relevant disciplines together to tackle this challenging problem. In addition, on October 20th, 2021, and March 7th 2022, some of the guest editorial team members organized two panel discussions on "Social Media Disinformation and its Impact on Public Health During the COVID-19 Pandemic," and on "Dis/Misinformation Analysis and Countermeasures - A Computational Viewpoint." This article summarizes the key discussion items at these two panels and hopes to shed light on the future directions.
Citations: 1
UHP-SOT++: An Unsupervised Lightweight Single Object Tracker
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000008
Zhiruo Zhou, Hongyu Fu, Suya You, C. J. Kuo
Citations: 4
The Future of Computer Vision
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000009
Jingjing Meng, Xilin Chen, Jurgen Gall, Chang-Su Kim, Zicheng Liu, A. Piva, Junsong Yuan
Citations: 1
Machine Learning for Wireless Communication: An Overview
IF 3.2
APSIPA Transactions on Signal and Information Processing Pub Date : 2022-01-01 DOI: 10.1561/116.00000029
Zijian Cao, Huan Zhang, Le Liang, Geoffrey Ye Li
Citations: 1