ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): latest papers

MSN-net: Multi-Scale Normality Network for Video Anomaly Detection
Y. Liu, Di Li, Wei Zhu, Dingkang Yang, Jing Liu, Liang Song
DOI: https://doi.org/10.1109/ICASSP49357.2023.10097052
Published: 2023-06-04
Abstract: Existing unsupervised video anomaly detection methods often suffer from performance degradation due to the overgeneralization of deep models. In this paper, we propose a simple yet effective Multi-Scale Normality network (MSN-net) that uses hierarchical memories to learn multi-level prototypical spatial-temporal patterns of normal events. Specifically, the hierarchical memory module interacts with the encoder through reading and writing operations during the training phase, preserving multi-scale normality in three separate memory pools. Then, the decoder decodes the features rewritten by the memorized normality to predict future frames so that its ability to predict anomalies is diminished. Experimental results show that MSN-net performs comparably to the state-of-the-art methods, and extension analysis demonstrates the effectiveness of multi-scale normality learning.
Citations: 3
Robust Log-Based Anomaly Detection with Hierarchical Contrastive Learning
Yuhui Zhao, Ruichun Yang, Ning Yang, Tao Lin, Qiuai Fu, Yuchi Ma
DOI: https://doi.org/10.1109/icassp49357.2023.10094961
Published: 2023-06-04
Abstract: Logs are widely employed in modern systems to record critical information and serve as an important source for anomaly detection, which has attracted increasing research interest. However, logs usually suffer from perturbations, which make existing log-based anomaly detection methods unstable. In this paper, we aim to solve this problem from the perspective of contrastive learning, by which the intrinsic and robust representations of logs are learned for anomaly detection. We propose two data augmentation methods to generate different views at different granularity for log data and design a deep hierarchical contrastive model for anomaly detection. In the contrastive semantic embedding module, we fine-tune a language model with a message-level contrastive loss. In the contrastive anomaly detection module, we apply a sequence-level contrastive constraint to help the detection model learn robust embeddings for log sequences. Experiments on three datasets verify the effectiveness of our proposed method.
Citations: 1
Tensor Completion for Efficient and Accurate Hyperparameter Optimisation in Large-Scale Statistical Learning
Aaman Rebello, Kriton Konstantinidis, Y. Xu, D. Mandic
DOI: https://doi.org/10.1109/ICASSP49357.2023.10096491
Published: 2023-06-04
Abstract: Hyperparameter optimisation is a prerequisite for state-of-the-art performance in machine learning, with current strategies including Bayesian optimisation, hyperband, and evolutionary methods. While such methods have been shown to improve performance, none of these is designed to explicitly take advantage of the underlying data structure. To this end, we introduce a completely different approach for hyperparameter optimisation, based on low-rank tensor completion. This is achieved by first forming a multi-dimensional tensor which comprises performance scores for different combinations of hyperparameters. Based on the realistic underlying assumption that the so-formed tensor has a low-rank structure, reliable estimates of the unobserved validation scores of combinations of hyperparameters are next obtained through tensor completion, from only a fraction of the known elements in the tensor. Through extensive experimentation on various datasets and learning models, the proposed method is shown to exhibit competitive or superior performance to state-of-the-art hyperparameter optimisation strategies. Distinctive advantages of the proposed method include its ability to simultaneously handle any hyperparameter type (kind of optimiser, number of neurons, number of layers, etc.), its relative simplicity compared to competing methods, as well as the ability to suggest multiple optimal combinations of hyperparameters.
Citations: 0
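The low-rank completion idea in the tensor-completion abstract above can be illustrated with a deliberately reduced sketch. It assumes only two hyperparameters (so the score tensor is a matrix), a rank-1 model, and plain alternating least squares; none of these choices, nor the function name `complete_rank1`, comes from the paper, whose actual algorithm is not described in this listing.

```python
def complete_rank1(scores, n_iters=100):
    """Fill missing entries (None) of a 2-D score grid with a rank-1 fit u[i] * v[j].

    Illustrative assumption: a matrix stands in for the hyperparameter tensor.
    """
    n_rows, n_cols = len(scores), len(scores[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iters):
        # Least-squares update of each row factor over that row's observed entries.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if scores[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(scores[i][j] * v[j] for j in obs) / den
        # Least-squares update of each column factor over that column's observed entries.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if scores[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(scores[i][j] * u[i] for i in obs) / den
    # Keep observed scores; predict only the unobserved ones.
    return [[scores[i][j] if scores[i][j] is not None else u[i] * v[j]
             for j in range(n_cols)] for i in range(n_rows)]


# Toy grid: rows = hypothetical learning rates, columns = hypothetical layer counts.
grid = [[0.2, 0.3, None],
        [None, 0.6, 0.8],
        [0.8, None, 1.6]]
filled = complete_rank1(grid)
best = max((filled[i][j], i, j) for i in range(3) for j in range(3))
```

After completion, `best` holds the highest estimated score and its grid position, so only the most promising unobserved combinations would need to be trained and validated.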
Robust Iterative Solution for Linear Array-Based 3-D Localization by Message Passing
Yimao Sun, K. Ho, Yanbing Yang, Lei Zhang, Liangyin Chen
DOI: https://doi.org/10.1109/ICASSP49357.2023.10095795
Published: 2023-06-04
Abstract: Recent research has shown that using the 1-D signal arrival angles observed by linear arrays can locate a 3-D source in unique coordinates. Current methods to solve this localization problem are based on semidefinite programming (SDP) or gradient-based iteration, which are either computationally demanding or prone to divergence or local-convergence issues. This paper reformulates the maximum likelihood (ML) estimation of the 3-D localization problem using the factor graph model, where an effective algorithm is designed through message passing. Although iterative, the proposed solution is more robust to measurement noise than the Gauss-Newton (GN) iterative solution, and the complexity is lower than the SDP solution without the need to introduce semidefinite relaxation error. Simulations validate the analytical performance and complexity, and confirm the superior convergence of the proposed solution.
Citations: 0
Hierarchical Hypergraph Recurrent Attention Network for Temporal Knowledge Graph Reasoning
Jiayan Guo, Meiqi Chen, Yan Zhang, Jianqiang Huang, Zhiwei Liu
DOI: https://doi.org/10.1109/ICASSP49357.2023.10095378
Published: 2023-06-04
Abstract: Temporal knowledge graphs (TKGs) serve as an essential tool in modeling complex event relations among real-world entities. A temporal knowledge graph can be viewed as a collection of knowledge graph snapshots ordered by time. Reasoning over such graphs remains nontrivial as temporal causal dependencies between events are hard to capture. Current TKG reasoning methods only model pair-wise relations, which are limited in capturing higher-order dependencies between entities that are beyond dyadic connections. In this work, we aim to capture higher-order interactions of entities for TKG reasoning. To achieve this goal, we develop a Hierarchical Hypergraph Recurrent Attention Network on the type-induced entity hypergraph with multiple hierarchies to model the evolutionary pattern under different semantic granularities. The experimental analysis on benchmark datasets demonstrates the proposed model's superiority and elucidates the rationality of the hierarchical hypergraph modeling.
Citations: 0
Deep Adaptive Superpixels For Hadamard Single Pixel Imaging In Near-Infrared Spectrum
Brayan Monroy, Jorge Bacca, H. Arguello
DOI: https://doi.org/10.1109/ICASSP49357.2023.10095165
Published: 2023-06-04
Abstract: Hadamard single-pixel imaging (HSI) is a promising sensing approach for acquiring spectral images in the near-infrared spectrum with high spatial resolution and fast recovery times due to the efficient invertible properties of the Hadamard matrix. The potential of the HSI system is diminished because of the large number of required measurements, which implies long acquisition times. Recent advances proposed optimizing the HSI sensing matrix structure based on a superpixels map estimated from a side-information acquisition of the scene, reducing the number of required measurements. However, these matrix designs are detached from the recovery task, which results in a sub-optimal strategy. In this work, we propose an adaptive end-to-end sensing methodology for the HSI sensing matrix design based on deep superpixels estimation by coupling the sensing and recovery of the near-infrared spectral images. Experimental results show the superiority of the proposed sensing methodology compared with state-of-the-art sensing design schemes.
Citations: 0
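The "efficient invertible properties of the Hadamard matrix" mentioned in the single-pixel imaging abstract above rest on a standard identity: a Sylvester-type Hadamard matrix H of order n is symmetric and satisfies H H = n I, so a fully sampled scene is recovered with one more multiplication by H and a division by n. The sketch below shows only this textbook property on a flattened toy scene; it does not model the paper's superpixel-based sensing design, and the function names are illustrative.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix; n must be a power of two."""
    h = [[1]]
    while len(h) < n:
        # Block rule: [[H, H], [H, -H]] doubles the order at each step.
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def measure(h, scene):
    """One single-pixel measurement per Hadamard pattern: y = H x."""
    return [sum(hij * xj for hij, xj in zip(row, scene)) for row in h]


def recover(h, measurements):
    """Invert the full measurement set using H^T H = n I: x = H^T y / n."""
    n = len(h)
    return [sum(h[i][j] * measurements[i] for i in range(n)) / n for j in range(n)]


scene = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical flattened 8-pixel scene
h = hadamard(len(scene))
recovered = recover(h, measure(h, scene))
```

Full recovery needs as many measurements as pixels, which is the acquisition cost the abstract says superpixel-based designs aim to reduce.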
Stereoscopic Video Retargeting Based on Camera Motion Classification
L. Cai, Zhenhua Tang
DOI: https://doi.org/10.1109/ICASSP49357.2023.10094758
Published: 2023-06-04
Abstract: Existing stereo video retargeting algorithms commonly use the same methodology to perform resizing without considering that different videos have different features, leading to low quality of reconstructed videos. To address this issue, we propose a stereo video retargeting method based on camera motion classification, which employs different retargeting strategies to rescale stereo videos. We also design an adaptive stereo video classification method which determines the types of camera motion according to the distribution of motion vectors extracted from the left view of stereo videos. Besides, we develop a motion saliency detection method to eliminate the jittering of moving objects during video resizing. Experimental results show that the quality of retargeted videos produced by our method is significantly superior to that of existing methods.
Citations: 0
Domain Adaptation with External Off-Policy Acoustic Catalogs for Scalable Contextual End-to-End Automated Speech Recognition
David Chan, Shalini Ghosh, A. Rastrow, Björn Hoffmeister
DOI: https://doi.org/10.1109/ICASSP49357.2023.10094924
Published: 2023-06-04
Abstract: Despite improvements to the generalization performance of automated speech recognition (ASR) models, specializing ASR models for downstream tasks remains a challenging task, primarily due to reduced data availability (necessitating increased data collection), and rapidly shifting data distributions (requiring more frequent model fine-tuning). In this work, we investigate the potential of leveraging external knowledge, particularly through off-policy generated text-to-speech key-value stores, to allow for flexible post-training adaptation to new data distributions. In our approach, audio embeddings captured from text-to-speech are used, along with semantic text embeddings, to bias ASR via an approximate k-nearest-neighbor (KNN) based attentive fusion step. Our experiments on LibriSpeech and in-house voice assistant/search datasets show that the proposed approach can reduce domain adaptation time by up to 1K GPU-hours while providing up to 3% WER improvement compared to a fine-tuning baseline, suggesting a promising approach for adapting production ASR systems in challenging zero and few-shot scenarios.
Citations: 0
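As a loose illustration of what a KNN-based attentive fusion step over a key-value store can look like, the sketch below retrieves the k closest keys to a query embedding and returns a softmax-weighted average of their values. Everything here is an assumption made for illustration: exact search replaces the approximate search named in the domain-adaptation abstract above, squared Euclidean distance is an arbitrary choice, and the paper's actual fusion architecture is not described in this listing.

```python
import math


def knn_attentive_fusion(query, keys, values, k=2):
    """Return a softmax-weighted mean of the values attached to the k nearest keys.

    Illustrative only: exact nearest-neighbour search, squared Euclidean distance.
    """
    # Distance from the query embedding to every stored key.
    dists = [sum((q - x) ** 2 for q, x in zip(query, key)) for key in keys]
    nearest = sorted(range(len(keys)), key=lambda i: dists[i])[:k]
    # Softmax over negative distances; subtract the minimum for numerical stability.
    d_min = min(dists[i] for i in nearest)
    exps = [math.exp(-(dists[i] - d_min)) for i in nearest]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, nearest)) for d in range(dim)]


# Hypothetical store: keys stand in for audio embeddings, values for text embeddings.
keys = [[0.0, 0.0], [10.0, 10.0], [0.0, 0.0]]
values = [[1.0, 0.0], [5.0, 5.0], [3.0, 0.0]]
context = knn_attentive_fusion([0.0, 0.0], keys, values, k=2)
```

Because the store is external to the model, its entries can be swapped without retraining, which is the kind of post-training adaptation the abstract describes.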
The USTC System for ADReSS-M Challenge
Kangdi Mei, Xinyun Ding, Yinlong Liu, Zhiqiang Guo, Feiyang Xu, Xin Li, Tuya Naren, Jiahong Yuan, Zhenhua Ling
DOI: https://doi.org/10.1109/ICASSP49357.2023.10094714
Published: 2023-06-04
Abstract: This paper describes our submission to the ICASSP 2023 Signal Processing Grand Challenge (SPGC), which focuses on multilingual Alzheimer’s disease (AD) recognition through spontaneous speech. Our approaches include using a variety of acoustic features and silence-related information for AD detection and mini-mental state examination (MMSE) score prediction, and fine-tuning wav2vec2.0 models on speech in various frequency bands for AD detection. Our overall results on the test data outperform the baseline provided by the organizers, achieving 73.9% accuracy in AD detection by fine-tuning our bilingual wav2vec2.0 pre-trained model on the 0-1000 Hz frequency band speech, and 4.610 RMSE (r = 0.565) in MMSE prediction through the fusion of eGeMAPS and silence features.
Citations: 1
COVID-19 Detection from Speech in Noisy Conditions
Shuo Liu, Adria Mallol-Ragolta, B. Schuller
DOI: https://doi.org/10.1109/ICASSP49357.2023.10094304
Published: 2023-06-04
Abstract: We explore the integration of audio enhancement into a speech-based COVID-19 detection system in an attempt to make speech captured in noisy environments from everyday life useful for the detection of the virus. For this purpose, two multi-task learning approaches are exploited to jointly optimise a front-end speech enhancement model and a subsequent COVID-19 detection model. In comparison to several baseline methods, such as noisy data augmentation, cold cascade of speech enhancement, and COVID-19 models, our proposed solutions are able to recover a substantial percentage of the performance reduction caused by real-world noises. Our best-performing model, which is trained using the synthetic data of the DiCOVA speech corpus and AudioSet environmental backgrounds, can achieve an average AUC of 76.87% on the test data covering a wide range of noise intensities, which is over 10% better than a COVID-19 model trained with clean audio.
Citations: 0