Speech Recognition for Analysis of Police Radio Communication

Tejes Srivastava, Ju-Chieh Chou, Priyank Shroff, Karen Livescu, Christopher Graziul
Published in: IEEE Workshop on Spoken Language Technology (SLT), proceedings, vol. 2024, pp. 906-912
Publication date: 2024-12-01
Publication type: Journal Article
DOI: 10.1109/slt61566.2024.10832157 (https://doi.org/10.1109/slt61566.2024.10832157)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12180137/pdf/
Citations: 0

Abstract

Police departments around the world use two-way radio for coordination. These broadcast police communications (BPC) are a unique source of information about everyday police activity and emergency response. Yet BPC are not transcribed, and their naturalistic audio properties make automatic transcription challenging. We collect a corpus of roughly 62,000 manually transcribed radio transmissions (~46 hours of audio) to evaluate the feasibility of automatic speech recognition (ASR) using modern recognition models. We evaluate the performance of off-the-shelf speech recognizers, models fine-tuned on BPC data, and customized end-to-end models. We find that both human and machine transcription is challenging in this domain. Large off-the-shelf ASR models perform poorly, but fine-tuned models can reach the approximate range of human performance. Our work suggests directions for future work, including analysis of short utterances and potential miscommunication in police radio interactions. We make our corpus and data annotation pipeline available to other researchers, to enable further research on recognition and analysis of police communication.
