Learning to Generate Equitable Text in Dialogue from Biased Training Data

Anthony Sicilia, Malihe Alikhani
{"title":"Learning to Generate Equitable Text in Dialogue from Biased Training Data","authors":"Anthony Sicilia, Malihe Alikhani","doi":"10.48550/arXiv.2307.04303","DOIUrl":null,"url":null,"abstract":"The ingrained principles of fairness in a dialogue system’s decision-making process and generated responses are crucial for user engagement, satisfaction, and task achievement. Absence of equitable and inclusive principles can hinder the formation of common ground, which in turn negatively impacts the overall performance of the system. For example, misusing pronouns in a user interaction may cause ambiguity about the intended subject. Yet, there is no comprehensive study of equitable text generation in dialogue. Aptly, in this work, we use theories of computational learning to study this problem. We provide formal definitions of equity in text generation, and further, prove formal connections between learning human-likeness and learning equity: algorithms for improving equity ultimately reduce to algorithms for improving human-likeness (on augmented data). With this insight, we also formulate reasonable conditions under which text generation algorithms can learn to generate equitable text without any modifications to the biased training data on which they learn. To exemplify our theory in practice, we look at a group of algorithms for the GuessWhat?! visual dialogue game and, using this example, test our theory empirically. Our theory accurately predicts relative-performance of multiple algorithms in generating equitable text as measured by both human and automated evaluation.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annual Meeting of the Association for Computational Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2307.04303","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

The ingrained principles of fairness in a dialogue system’s decision-making process and generated responses are crucial for user engagement, satisfaction, and task achievement. The absence of equitable and inclusive principles can hinder the formation of common ground, which in turn negatively impacts the overall performance of the system. For example, misusing pronouns in a user interaction may cause ambiguity about the intended subject. Yet, there is no comprehensive study of equitable text generation in dialogue. Aptly, in this work, we use theories of computational learning to study this problem. We provide formal definitions of equity in text generation, and further, prove formal connections between learning human-likeness and learning equity: algorithms for improving equity ultimately reduce to algorithms for improving human-likeness (on augmented data). With this insight, we also formulate reasonable conditions under which text generation algorithms can learn to generate equitable text without any modifications to the biased training data on which they learn. To exemplify our theory in practice, we look at a group of algorithms for the GuessWhat?! visual dialogue game and, using this example, test our theory empirically. Our theory accurately predicts the relative performance of multiple algorithms in generating equitable text as measured by both human and automated evaluation.
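
The abstract's central claim is a reduction: improving equity amounts to running an ordinary human-likeness (maximum-likelihood) learner on augmented data. The sketch below illustrates one way such augmentation could look, assuming a counterfactual swap of gendered referring expressions in the dialogue corpus; the swap table, function names, and probability parameter are illustrative assumptions, not the authors' exact procedure.

```python
# Illustrative sketch (not the paper's implementation): learn equitable
# generation by augmenting the biased training dialogues, then training any
# standard human-likeness objective on the augmented corpus unchanged.

import random

# Hypothetical swap table for counterfactual augmentation (assumption).
SWAPS = {"he": "she", "she": "he", "him": "her", "her": "him",
         "man": "woman", "woman": "man"}


def counterfactual(utterance: str) -> str:
    """Swap gendered tokens to produce a counterfactual copy of an utterance."""
    tokens = utterance.lower().split()
    return " ".join(SWAPS.get(tok, tok) for tok in tokens)


def augment(dialogues: list[list[str]], p: float = 0.5) -> list[list[str]]:
    """Return the original dialogues plus counterfactual copies.

    The learning algorithm itself is untouched; only its data changes,
    which is the kind of reduction the abstract alludes to.
    """
    augmented = list(dialogues)
    for dialogue in dialogues:
        if random.random() < p:
            augmented.append([counterfactual(turn) for turn in dialogue])
    return augmented


if __name__ == "__main__":
    corpus = [["is it the man on the left?", "no"],
              ["is she holding a racket?", "yes"]]
    for dialogue in augment(corpus, p=1.0):
        print(dialogue)
```

A generator trained for human-likeness on `augment(corpus)` would, under the paper's stated conditions, also improve on the equity criterion, even though the underlying corpus remains biased.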