Acoustic Pre-training with Contrastive Learning for Gunshot Recognition

Xianjie Shen, Saimin Ma, Linlin Yang, Yubo Jiang, Zhifeng Xiao, Shuren Xu
Published in: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things
Publication date: 2023-05-26
DOI: 10.1145/3603781.3603908 (https://doi.org/10.1145/3603781.3603908)
Citations: 0

Abstract

Gun control has become a serious social and political issue in some countries. Automatic, accurate, and fast gunshot recognition can assist police in identifying gun caliber, which helps them track suspects and speeds up criminal investigations. Recent developments in deep learning have brought new opportunities to speech and acoustic recognition. However, the lack of sufficient training examples remains a challenge for training a robust model. In this paper, we propose an acoustic pre-training method with contrastive learning to capture gunshot-like sounds in a rich collection of urban sounds. Specifically, we develop an encoder-decoder model that uses more typical samples from external datasets to mine semantic acoustic features in a self-supervised manner. The pre-trained network is then fine-tuned on the downstream task of gunshot recognition. Extensive experiments demonstrate that our methods outperform existing machine learning methods.
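
The abstract does not say which contrastive objective is used, so the following is only a generic illustration of the idea and not the authors' implementation. It sketches an InfoNCE-style loss, one common choice for contrastive pre-training: embeddings of two views of the same audio clip are pulled together, while the other clips in the batch serve as negatives. The function name `info_nce_loss`, the `temperature` value, and the random stand-in embeddings are all assumptions made for this example.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss for a batch of paired embeddings (hypothetical helper).

    anchors[i] and positives[i] are embeddings of two views of the same clip;
    every other row of `positives` acts as a negative for anchors[i].
    """
    # L2-normalise so the dot product equals cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # logits[i, j]: similarity of anchor i to candidate j, scaled by temperature.
    logits = (a @ p.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct candidate for anchor i is positive i (the diagonal).
    return float(-np.mean(np.diag(log_probs)))


rng = np.random.default_rng(0)
# Random vectors standing in for encoder outputs of 8 audio clips (assumption).
clip_embeddings = rng.normal(size=(8, 16))
# A slightly perturbed copy plays the role of an augmented view of each clip.
augmented = clip_embeddings + 0.01 * rng.normal(size=(8, 16))
# Shifting rows by one pairs every clip with a different clip.
mismatched = np.roll(clip_embeddings, 1, axis=0)

aligned_loss = info_nce_loss(clip_embeddings, augmented)
mismatched_loss = info_nce_loss(clip_embeddings, mismatched)
print(aligned_loss, mismatched_loss)
```

In this sketch the loss is lower when each clip is paired with its own augmented view than when the pairs are shifted, which is the signal a contrastive pre-training stage would optimise before fine-tuning on labelled gunshot data.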