Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering: Latest Articles

Enhanced Training Methods for Multiple Languages

Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.dialdoc-1.6

Hai Li, Y. Li

Citations: 0

MoQA: Benchmarking Multi-Type Open-Domain Question Answering

Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.dialdoc-1.2

Ho-Ching Yen, Tianyu Gao, Jinhyuk Lee, Danqi Chen

Abstract: Previous research on open-domain question answering (QA) mainly focuses on questions with short answers. However, information-seeking QA often requires various formats of answers depending on the nature of the questions, e.g., why/how questions typically require a long answer. In this paper, we present MoQA, a benchmark for open-domain QA that requires building one system that can provide short, medium, long, and yes/no answers to different questions accordingly. MoQA builds upon Natural Questions with multiple types of questions and additional crowdsourcing efforts to ensure high query quality. We adapt state-of-the-art models, and reveal unique findings in multi-type open-domain QA: (1) For retriever-reader models, training one retriever on all types achieves the overall best performance, but it is challenging to train one reader model to output answers of different formats, or to train a question classifier to distinguish between types; (2) An end-to-end closed-book QA model trained on multiple types struggles with the task across the board; (3) State-of-the-art large language models such as the largest GPT-3 models (Brown et al., 2020; Ouyang et al., 2022) also lag behind open-book QA models. Our benchmark and analysis call for more effort into building versatile open-domain QA models in the future.

Citations: 0

Language-Agnostic Transformers and Assessing ChatGPT-Based Query Rewriting for Multilingual Document-Grounded QA

Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.dialdoc-1.11

Srinivas Gowriraj, Soham Dinesh Tiwari, Mitali Potnis, Srijan Bansal, T. Mitamura, Eric Nyberg

Citations: 1

Exploration of multilingual prompts in document-grounded dialogue

Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.dialdoc-1.3

Xiaochen Zhang, Huang Qing, Fu Lin

Citations: 0

SLDT: Sequential Latent Document Transformer for Multilingual Document-based Dialogue

Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.dialdoc-1.7

Zhanyu Ma, Zeming Liu, Jian Ye

Citations: 1

Follow the Knowledge: Structural Biases and Artefacts in Knowledge Grounded Dialog Datasets

Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.dialdoc-1.12

Ehsan Lotfi, Maxime De Bruyn, Jeska Buhmann, Walter Daelemans

Abstract: Crowd-sourcing has been one of the primary ways to curate conversational data, especially for certain scenarios like grounding in knowledge. In this setting, using online platforms like AMT, non-expert participants are hired to converse with each other, following instructions which try to guide the outcome towards the desired format. The resulting data is then used for different parts of dialog modelling like knowledge selection and response selection/generation. In this work, we take a closer look into two of the most popular knowledge grounded dialog (KGD) datasets. Investigating potential biases and artefacts in knowledge selection labels, we observe that in many cases the ‘knowledge selection flow’ simply follows the order of presented knowledge pieces. In Wizard of Wikipedia (the most popular KGD dataset) we use simple content-agnostic models based on this bias to get significant knowledge selection performance. In Topical-Chat we see a similar correlation between the knowledge selection sequence and the order of entities and their segments, as provided to crowd-source workers. We believe that the observed results question the significance and origin of the presumed dialog-level attributes like ‘knowledge flow’ in these crowd-sourced datasets.

Citations: 0

Enhancing Multilingual Document-Grounded Dialogue Using Cascaded Prompt-Based Post-Training Models

Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.dialdoc-1.5

Jun Liu, Shuang Cheng, Zineng Zhou, Yang Gu, Jian Ye, Haiyong Luo

Abstract: The Dialdoc23 shared task presents a Multilingual Document-Grounded Dialogue Systems (MDGDS) challenge, where system responses are generated in multiple languages using users’ queries, historical dialogue records and relevant passages. A major challenge for this task is the limited training data available in low-resource languages such as French and Vietnamese. In this paper, we propose Cascaded Prompt-based Post-training Models, dividing the task into three subtasks: Retrieval, Reranking and Generation. We conduct post-training on high-resource languages such as English and Chinese to enhance the performance of low-resource languages by using the similarities of languages. Additionally, we utilize the prompt method to activate the model’s ability on diverse languages within the dialogue domain and explore which prompt is a good prompt. Our comprehensive experiments demonstrate the effectiveness of our proposed methods, which achieved first place on the leaderboard with a total score of 215.40 in token-level F1, SacreBleu, and Rouge-L metrics.

Citations: 0

ConvRGX: Recognition, Generation, and Extraction for Self-trained Conversational Question Answering

Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.dialdoc-1.10

Tianhua Zhang, Liping Tang, Wei Fang, Hongyin Luo, Xixin Wu, H. Meng, James R. Glass

Citations: 0