端到端命名实体与语音语义概念提取

2018 IEEE Spoken Language Technology Workshop (SLT) Pub Date : 2018-12-01 DOI:10.1109/SLT.2018.8639513

Sahar Ghannay, Antoine Caubrière, Y. Estève, Nathalie Camelin, E. Simonnet, Antoine Laurent, E. Morin

{"title":"端到端命名实体与语音语义概念提取","authors":"Sahar Ghannay, Antoine Caubrière, Y. Estève, Nathalie Camelin, E. Simonnet, Antoine Laurent, E. Morin","doi":"10.1109/SLT.2018.8639513","DOIUrl":null,"url":null,"abstract":"Named entity recognition (NER) is among SLU tasks that usually extract semantic information from textual documents. Until now, NER from speech is made through a pipeline process that consists in processing first an automatic speech recognition (ASR) on the audio and then processing a NER on the ASR outputs. Such approach has some disadvantages (error propagation, metric to tune ASR systems sub-optimal in regards to the final task, reduced space search at the ASR output level,...) and it is known that more integrated approaches outperform sequential ones, when they can be applied. In this paper, we explore an end-to-end approach that directly extracts named entities from speech, though a unique neural architecture. On a such way, a joint optimization is possible for both ASR and NER. Experiments are carried on French data easily accessible, composed of data distributed in several evaluation campaigns. The results are promising since this end-to-end approach provides similar results (F-measure= 0.66 on test data) than a classical pipeline approach to detect named entity categories (F-measure=0.64). Last, we also explore this approach applied to semantic concept extraction, through a slot filling task known as a spoken language understanding problem, and also observe an improvement in comparison to a pipeline approach.","PeriodicalId":377307,"journal":{"name":"2018 IEEE Spoken Language Technology Workshop (SLT)","volume":"138 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"75","resultStr":"{\"title\":\"End-To-End Named Entity And Semantic Concept Extraction From Speech\",\"authors\":\"Sahar Ghannay, Antoine Caubrière, Y. Estève, Nathalie Camelin, E. Simonnet, Antoine Laurent, E. Morin\",\"doi\":\"10.1109/SLT.2018.8639513\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Named entity recognition (NER) is among SLU tasks that usually extract semantic information from textual documents. Until now, NER from speech is made through a pipeline process that consists in processing first an automatic speech recognition (ASR) on the audio and then processing a NER on the ASR outputs. Such approach has some disadvantages (error propagation, metric to tune ASR systems sub-optimal in regards to the final task, reduced space search at the ASR output level,...) and it is known that more integrated approaches outperform sequential ones, when they can be applied. In this paper, we explore an end-to-end approach that directly extracts named entities from speech, though a unique neural architecture. On a such way, a joint optimization is possible for both ASR and NER. Experiments are carried on French data easily accessible, composed of data distributed in several evaluation campaigns. The results are promising since this end-to-end approach provides similar results (F-measure= 0.66 on test data) than a classical pipeline approach to detect named entity categories (F-measure=0.64). Last, we also explore this approach applied to semantic concept extraction, through a slot filling task known as a spoken language understanding problem, and also observe an improvement in comparison to a pipeline approach.\",\"PeriodicalId\":377307,\"journal\":{\"name\":\"2018 IEEE Spoken Language Technology Workshop (SLT)\",\"volume\":\"138 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"75\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE Spoken Language Technology Workshop (SLT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SLT.2018.8639513\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE Spoken Language Technology Workshop (SLT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT.2018.8639513","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 75

摘要

命名实体识别(NER)是通常从文本文档中提取语义信息的SLU任务之一。到目前为止，语音识别是通过一个流水线过程进行的，该过程包括首先在音频上处理自动语音识别(ASR)，然后在ASR输出上处理NER。这种方法有一些缺点(误差传播，调整ASR系统的度量在最终任务方面不是最优的，在ASR输出级别减少空间搜索，……)，并且众所周知，当可以应用时，更集成的方法优于顺序方法。在本文中，我们探索了一种端到端方法，通过独特的神经结构直接从语音中提取命名实体。通过这种方式，ASR和NER的联合优化是可能的。实验是在法国容易获得的数据上进行的，这些数据由在几个评价活动中分发的数据组成。结果是有希望的，因为这种端到端方法提供了与传统管道方法相似的结果(测试数据上的F-measure= 0.66)来检测命名实体类别(F-measure=0.64)。最后，我们还探索了这种方法在语义概念提取中的应用，通过一个被称为口语理解问题的槽填充任务，并观察到与管道方法相比的改进。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

End-To-End Named Entity And Semantic Concept Extraction From Speech

Named entity recognition (NER) is among SLU tasks that usually extract semantic information from textual documents. Until now, NER from speech is made through a pipeline process that consists in processing first an automatic speech recognition (ASR) on the audio and then processing a NER on the ASR outputs. Such approach has some disadvantages (error propagation, metric to tune ASR systems sub-optimal in regards to the final task, reduced space search at the ASR output level,...) and it is known that more integrated approaches outperform sequential ones, when they can be applied. In this paper, we explore an end-to-end approach that directly extracts named entities from speech, though a unique neural architecture. On a such way, a joint optimization is possible for both ASR and NER. Experiments are carried on French data easily accessible, composed of data distributed in several evaluation campaigns. The results are promising since this end-to-end approach provides similar results (F-measure= 0.66 on test data) than a classical pipeline approach to detect named entity categories (F-measure=0.64). Last, we also explore this approach applied to semantic concept extraction, through a slot filling task known as a spoken language understanding problem, and also observe an improvement in comparison to a pipeline approach.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2018 IEEE Spoken Language Technology Workshop (SLT)

自引率

0.00%

发文量