Learning Natural Language Understanding Systems from Unaligned Labels for Voice Command in Smart Homes
Anastasia Mishakova, François Portet, Thierry Desot, Michel Vacher
2019 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), March 2019
DOI: 10.1109/PERCOMW.2019.8730721
Citations: 11
Abstract
Voice command smart home systems have become a target for the industry to provide more natural human-computer interaction. To interpret voice commands, systems must be able to extract meaning from natural language; this task is called Natural Language Understanding (NLU). Modern NLU is based on statistical models trained on data. However, a current limitation of most statistical NLU models is their dependence on large amounts of textual data aligned with target semantic labels, which is highly time-consuming to produce. Moreover, they require training several separate models to predict intents, slot labels and slot values. In this paper, we propose to use a sequence-to-sequence neural architecture to train NLU models that do not need aligned data and can jointly learn the intent, slot-label and slot-value prediction tasks. This approach has been evaluated both on a voice command dataset we acquired for the purpose of the study and on a publicly available dataset. The experiments show that a single model learned on unaligned data is competitive with state-of-the-art models that depend on aligned data.
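To make the aligned/unaligned distinction concrete, the following sketch (not the authors' code; all labels and the target format are hypothetical) contrasts a per-token aligned annotation, which requires word-level labeling, with a flat target sequence of intent, slot-label and slot-value symbols that a sequence-to-sequence model could generate without any token-level alignment:

```python
# Sketch under assumptions: example labels ("turn_on", "device", "location")
# and the bracketed target format are illustrative, not from the paper.

# Aligned representation: one BIO slot label per input token, so annotation
# must be word-aligned with the utterance.
utterance = "turn on the light in the kitchen".split()
aligned = ["O", "O", "O", "B-device", "O", "O", "B-location"]
assert len(aligned) == len(utterance)  # the alignment constraint

# Unaligned representation: a single target sequence jointly encoding the
# intent, slot labels and slot values; no per-token correspondence is needed,
# which is what lets a seq2seq model learn all three tasks at once.
unaligned = "intent[turn_on] device[light] location[kitchen]"

def parse_unaligned(target: str) -> dict:
    """Recover intent and slots from a generated symbol sequence."""
    out = {}
    for chunk in target.split():
        label, _, rest = chunk.partition("[")
        out[label] = rest.rstrip("]")
    return out

print(parse_unaligned(unaligned))
# {'intent': 'turn_on', 'device': 'light', 'location': 'kitchen'}
```

Because the target is just another token sequence, the annotation effort drops to writing one semantic string per utterance instead of labeling every word.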