Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, Sebastian Möller, Vera Schmitt
arXiv - CS - Computation and Language · arxiv-2409.07123 · published 2024-09-11
Cross-Refine: Improving Natural Language Explanation Generation by Learning in Tandem
Natural language explanations (NLEs) are vital for elucidating the reasoning
behind large language model (LLM) decisions. Many techniques have been
developed to generate NLEs using LLMs. However, like humans, LLMs might not
always produce optimal NLEs on the first attempt. Inspired by human learning
processes, we introduce Cross-Refine, which employs role modeling by deploying
two LLMs as generator and critic, respectively. The generator outputs a first
NLE and then refines this initial explanation using feedback and suggestions
provided by the critic. Cross-Refine does not require any supervised training
data or additional training. We validate Cross-Refine across three NLP tasks
using three state-of-the-art open-source LLMs through automatic and human
evaluation. We select Self-Refine (Madaan et al., 2023) as the baseline, which
only utilizes self-feedback to refine the explanations. Our findings from
automatic evaluation and a user study indicate that Cross-Refine outperforms
Self-Refine. Moreover, Cross-Refine performs effectively even with less powerful
LLMs, whereas Self-Refine yields strong results only with ChatGPT.
Additionally, we conduct an ablation study to assess the importance of feedback
and suggestions; both play an important role in refining explanations.
We further evaluate Cross-Refine on a bilingual dataset in English and German.
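The generator–critic loop the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, prompt wording, and the single refinement round are all assumptions, and `generator_call`/`critic_call` stand in for whatever LLM interface is used.

```python
# Hypothetical sketch of the Cross-Refine loop: a generator LLM produces an
# initial natural language explanation (NLE), a critic LLM returns feedback
# and suggestions, and the generator refines its explanation accordingly.
# All prompt templates below are illustrative, not taken from the paper.

def generate(model_call, task_input):
    # Generator produces a first NLE for the task input.
    return model_call(f"Explain the reasoning behind the answer to: {task_input}")

def critique(model_call, task_input, explanation):
    # Critic provides feedback and suggestions on the initial NLE.
    return model_call(
        f"Give feedback and suggestions for this explanation of "
        f"'{task_input}': {explanation}"
    )

def refine(model_call, task_input, explanation, feedback):
    # Generator revises its explanation using the critic's feedback.
    return model_call(
        f"Improve the explanation '{explanation}' for '{task_input}' "
        f"using this feedback: {feedback}"
    )

def cross_refine(generator_call, critic_call, task_input):
    nle = generate(generator_call, task_input)
    feedback = critique(critic_call, task_input, nle)
    return refine(generator_call, task_input, nle, feedback)
```

Note that, consistent with the abstract, the loop needs no supervised training data: both roles are played by off-the-shelf LLMs at inference time, and the critic differs from Self-Refine in being a second model rather than the generator itself.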