Authors: Mahdin Rohmatillah, Jen-Tzung Chien
Journal: APSIPA Transactions on Signal and Information Processing (2023)
DOI: 10.1561/116.00000132 (https://doi.org/10.1561/116.00000132)
Citation count: 0
Advances and Challenges in Multi-Domain Task-Oriented Dialogue Policy Optimization
Developing a successful dialogue policy for a multi-domain task-oriented dialogue (MDTD) system is challenging. A desirable dialogue policy acts as a decision-making agent that understands the user's intention and provides suitable responses within a short conversation. Giving precise answers that satisfy the user's requirements makes the task even more challenging. This paper surveys recent advances in multi-domain task-oriented dialogue policy optimization and summarizes a number of solutions to policy learning. In particular, it investigates a case study on travel assistance using an MDTD dataset based on MultiWOZ, which contains seven different domains. The dialogue policy optimization methods, categorized into dialogue-act level and word level, are systematically presented. Moreover, this paper addresses a number of challenges, including user simulator design and dialogue policy evaluation, that need to be resolved to further enhance the robustness and effectiveness of multi-domain dialogue policy representation.

Corresponding author: Jen-Tzung Chien, jtchien@nycu.edu.tw. Received 22 May 2023; revised 20 July 2023. ISSN 2048-7703; DOI 10.1561/116.00000132. © 2023 M. Rohmatillah and J.-T. Chien.