A context-aware speech recognition and understanding system for air traffic control domain

2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Pub Date: 2017-12-01
DOI: 10.1109/ASRU.2017.8268964

Youssef Oualil, D. Klakow, György Szaszák, A. Srinivasamurthy, H. Helmke, P. Motlícek


Citations: 20

Abstract



Automatic Speech Recognition and Understanding (ASRU) systems can generally use temporal and situational context information to improve their performance on a given task. This is typically done by rescoring the ASR hypotheses or by dynamically adapting the ASR models. In some domains, however, such as Air Traffic Control (ATC), this context information can be small in size, partial, and available only as abstract concepts (e.g. airline codes), which are difficult to map to full possible spoken sentences for rescoring or adaptation. This paper presents a multi-modal ASRU system that dynamically integrates partial temporal and situational ATC context information to improve its performance. This is done either by 1) extracting word sequences that carry relevant ATC information from the ASR N-best lists and then performing context-based rescoring on the extracted ATC segments, or 2) partially adapting the language model. Experiments conducted on 4 hours of test data from the Prague and Vienna approach (arrivals) showed a relative reduction of the ATC command error rate of 30% to 50%.
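The first strategy in the abstract (extract ATC segments from the N-best list, then rescore them against context) can be illustrated with a toy sketch. This is not the authors' implementation: the lookup tables `AIRLINES` and `DIGITS`, the `Hypothesis` class, the functions `extract_callsign` and `rescore`, the additive `bonus`, and the example scores and callsigns are all hypothetical, chosen only to show how an abstract context item such as an airline code might be matched against a segment pulled out of a hypothesis.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup tables; a real system would need far larger ones.
AIRLINES = {"lufthansa": "DLH", "austrian": "AUA", "csa": "CSA"}
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}


@dataclass
class Hypothesis:
    text: str         # one entry of the ASR N-best list
    asr_score: float  # recognizer log-score, higher is better (assumed)


def extract_callsign(text: str) -> Optional[str]:
    """Map the first '<airline> <digit words>' segment to a code like 'DLH321'."""
    words = text.lower().split()
    for i, word in enumerate(words):
        if word in AIRLINES:
            number = ""
            for nxt in words[i + 1:]:
                if nxt not in DIGITS:
                    break
                number += DIGITS[nxt]
            if number:
                return AIRLINES[word] + number
    return None


def rescore(nbest, context_callsigns, bonus=5.0):
    """Re-rank hypotheses, rewarding callsigns that appear in the context."""
    def score(h):
        callsign = extract_callsign(h.text)
        return h.asr_score + (bonus if callsign in context_callsigns else 0.0)
    return sorted(nbest, key=score, reverse=True)


nbest = [
    Hypothesis("lufthansa three two nine descend flight level eight zero", -10.0),
    Hypothesis("lufthansa three two one descend flight level eight zero", -11.5),
]
context = {"DLH321", "AUA77"}  # aircraft assumed to be in the sector
best = rescore(nbest, context)[0]
print(best.text)
```

In this sketch only the extracted callsign segment is compared with the context, so the context never has to be expanded into full spoken sentences; the fixed additive bonus is a simplification made for illustration.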
