Building chatbots from large scale domain-specific knowledge bases: challenges and opportunities

2020 IEEE International Conference on Prognostics and Health Management (ICPHM). Pub Date: 2019-12-31. DOI: 10.1109/ICPHM49022.2020.9187036

W. Shalaby, Adriano Arantes, Teresa GonzalezDiaz, Chetan Gupta


Citations: 8

Abstract

Popular conversational agent frameworks such as Alexa Skills Kit (ASK) and Google Actions (gActions) offer unprecedented opportunities for facilitating the development and deployment of voice-enabled AI solutions in various verticals. Nevertheless, understanding user utterances with high accuracy remains a challenging task with these frameworks, particularly when building chatbots with a large volume of domain-specific entities. In this paper, we describe the challenges and lessons learned from building a large-scale virtual assistant for understanding and responding to equipment-related complaints. In the process, we describe an alternative scalable framework for: 1) extracting knowledge about equipment components and their associated problem entities from short texts, and 2) learning to identify such entities in user utterances. We show through evaluation on a real dataset that the proposed framework, compared to popular off-the-shelf ones, scales better with a large volume of entities, being up to 30% more accurate, and is more effective in understanding user utterances with domain-specific entities.


