Multi-Document Grounded Multi-Turn Synthetic Dialog Generation
Young-Suk Lee, Chulaka Gunasekara, Danish Contractor, Ramón Fernandez Astudillo, Radu Florian
arXiv - CS - Computation and Language, 2024-09-17 (arxiv-2409.11500)
https://doi.org/arxiv-2409.11500

We introduce a technique for multi-document grounded multi-turn synthetic
dialog generation that incorporates three main ideas. First, we control the
overall dialog flow using taxonomy-driven user queries that are generated with
Chain-of-Thought (CoT) prompting. Second, we support the generation of
multi-document grounded dialogs by mimicking real-world use of retrievers to
update the grounding documents after every user turn in the dialog. Third, we
apply LLM-as-a-Judge to filter out queries with incorrect answers. Human
evaluation of the synthetic dialog data suggests that the data is diverse,
coherent, and includes mostly correct answers. Both human and automatic
evaluations of answerable queries indicate that models fine-tuned on synthetic
dialogs consistently outperform those fine-tuned on existing human-generated
training data across four publicly available multi-turn document-grounded
benchmark test sets.
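
The three ideas above can be read as one generation loop: choose a query type from a taxonomy, retrieve fresh grounding documents for the resulting user turn, answer from those documents, and keep the turn only if a judge accepts the answer. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation. Every name in it (`Turn`, `retrieve`, `generate_dialog`, `make_query`, `make_answer`, `judge`) is hypothetical. The word-overlap retriever and the rule-based stubs stand in for a real retriever and for LLM calls, and dropping a rejected turn (rather than discarding the whole dialog) is an assumption the abstract does not settle.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    query: str
    answer: str
    documents: List[str]  # grounding documents retrieved for this turn


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def generate_dialog(
    corpus: List[str],
    query_types: List[str],
    make_query: Callable[[str, List[Turn]], str],
    make_answer: Callable[[str, List[str], List[Turn]], str],
    judge: Callable[[str, str, List[str]], bool],
    k: int = 2,
) -> List[Turn]:
    """Build one synthetic dialog, refreshing grounding documents every user turn."""
    history: List[Turn] = []
    for qtype in query_types:
        # 1. Taxonomy-driven user query (an LLM with CoT prompting in the paper).
        query = make_query(qtype, history)
        # 2. Update the grounding documents after the user turn.
        docs = retrieve(query, corpus, k)
        # 3. Answer from the retrieved documents.
        answer = make_answer(query, docs, history)
        # 4. Judge filter: keep the turn only if the answer is accepted.
        if judge(query, answer, docs):
            history.append(Turn(query, answer, docs))
    return history


# Hypothetical stand-ins for the LLM components, for illustration only.
corpus = [
    "The library opens at nine on weekdays.",
    "Late fees are one dollar per day.",
    "Members may borrow up to five books.",
]
canned_queries = {
    "factoid": "When does the library open?",
    "follow_up": "How many books may members borrow?",
}


def make_query(qtype: str, history: List[Turn]) -> str:
    return canned_queries[qtype]


def make_answer(query: str, docs: List[str], history: List[Turn]) -> str:
    return docs[0]  # echo the top-ranked document as the "answer"


def judge(query: str, answer: str, docs: List[str]) -> bool:
    return answer in docs  # accept only answers found in the grounding set


dialog = generate_dialog(corpus, ["factoid", "follow_up"], make_query, make_answer, judge)
for turn in dialog:
    print(turn.query, "->", turn.answer)
```

Because retrieval runs inside the loop, each turn can be grounded in a different document set, which is what makes the resulting dialog multi-document grounded.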