Knowledge-Enriched Moral Understanding upon Continual Pre-training

Jing Qian, Yong Yue, Katie Atkinson, Gangmin Li
{"title":"Knowledge-Enriched Moral Understanding upon Continual Pre-training","authors":"Jing Qian, Yong Yue, Katie Atkinson, Gangmin Li","doi":"10.5121/csit.2023.130414","DOIUrl":null,"url":null,"abstract":"The aim of moral understanding is to comprehend the abstract concepts that hide in a story by seeing through concrete events and vivid characters. To be specific, the story is highly summarized in one sentence without covering any characters in the original story, which requires the machine to behave more intelligently with the abilities of moral perception and commonsense reasoning. The paradigm of “pre-training + fine-tuning” is generally accepted for applying neural language models. In this paper, we suggest adding an intermediate stage to build the flow of “pre-training + continual pre-training + finetuning”. Continual pre-training refers to further training on task-relevant or domainspecific corpora with the aim of bridging the data distribution gap between pre-training and fine-tuning. Experiments are basing on a new moral story dataset, STORAL-ZH, that composes of 4,209 Chinese story-moral pairs. We collect a moral corpus about Confucius theory to enrich the T5 model with moral knowledge. Furthermore, we leverage a Chinese commonsense knowledge graph to enhance the model with commonsense knowledge. Experimental results demonstrate the effectiveness of our method, compared with several state-of-the-art models including BERT-base, RoBERTa-base and T5-base.","PeriodicalId":159989,"journal":{"name":"Computer Networks & Communications","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Networks & Communications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5121/csit.2023.130414","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The aim of moral understanding is to comprehend the abstract concepts hidden in a story by seeing through its concrete events and vivid characters. Specifically, the story is summarized in a single sentence that mentions none of the characters in the original story, which requires the machine to behave more intelligently, with the abilities of moral perception and commonsense reasoning. The paradigm of "pre-training + fine-tuning" is generally accepted for applying neural language models. In this paper, we propose adding an intermediate stage to build the flow of "pre-training + continual pre-training + fine-tuning". Continual pre-training refers to further training on task-relevant or domain-specific corpora, with the aim of bridging the data distribution gap between pre-training and fine-tuning. Experiments are based on a new moral-story dataset, STORAL-ZH, which consists of 4,209 Chinese story-moral pairs. We collect a moral corpus on Confucian theory to enrich the T5 model with moral knowledge. Furthermore, we leverage a Chinese commonsense knowledge graph to enhance the model with commonsense knowledge. Experimental results demonstrate the effectiveness of our method compared with several state-of-the-art models, including BERT-base, RoBERTa-base and T5-base.
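The sketch below illustrates the "pre-training + continual pre-training + fine-tuning" flow described in the abstract, using Hugging Face Transformers: a pre-trained T5 checkpoint is further trained with a T5-style denoising objective on a moral/commonsense corpus, then fine-tuned on story-moral pairs. The checkpoint name, example texts, and hyperparameters are illustrative placeholders (the paper uses a Chinese T5 and the STORAL-ZH data), not the authors' actual settings.

```python
# Minimal sketch of pre-training -> continual pre-training -> fine-tuning with T5.
# Checkpoint, corpus samples, and hyperparameters below are placeholders.
import torch
from torch.utils.data import Dataset
from transformers import (
    AutoTokenizer,
    T5ForConditionalGeneration,
    Trainer,
    TrainingArguments,
)


class Seq2SeqDataset(Dataset):
    """Tokenizes (source, target) text pairs for T5-style training."""

    def __init__(self, pairs, tokenizer, max_len=128):
        self.pairs, self.tokenizer, self.max_len = pairs, tokenizer, max_len

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        src, tgt = self.pairs[idx]
        enc = self.tokenizer(src, truncation=True, padding="max_length",
                             max_length=self.max_len, return_tensors="pt")
        lab = self.tokenizer(tgt, truncation=True, padding="max_length",
                             max_length=self.max_len, return_tensors="pt")
        labels = lab.input_ids.squeeze(0)
        labels[labels == self.tokenizer.pad_token_id] = -100  # ignore padding in the loss
        return {
            "input_ids": enc.input_ids.squeeze(0),
            "attention_mask": enc.attention_mask.squeeze(0),
            "labels": labels,
        }


def run_stage(model, tokenizer, pairs, output_dir, epochs=1):
    """Runs one training stage (continual pre-training or fine-tuning)."""
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir,
                               num_train_epochs=epochs,
                               per_device_train_batch_size=2,
                               report_to=[]),
        train_dataset=Seq2SeqDataset(pairs, tokenizer),
    )
    trainer.train()
    return model


if __name__ == "__main__":
    # Stage 1: start from a publicly pre-trained T5 checkpoint.
    # ("t5-base" is an English stand-in; a Chinese T5 checkpoint would be used for STORAL-ZH.)
    name = "t5-base"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = T5ForConditionalGeneration.from_pretrained(name)

    # Stage 2: continual pre-training on a moral/commonsense corpus with a
    # T5-style denoising objective (mask a span, predict it via sentinel tokens).
    moral_corpus = [
        ("The Master said: to learn and to <extra_id_0> what one has learned.",
         "<extra_id_0> practise <extra_id_1>"),
    ]
    model = run_stage(model, tokenizer, moral_corpus, "ckpt_continual")

    # Stage 3: fine-tuning on story-moral pairs (story as input, moral as target).
    story_moral_pairs = [
        ("generate moral: A farmer waited by a stump for another hare to run into it.",
         "Do not rely on luck instead of effort."),
    ]
    model = run_stage(model, tokenizer, story_moral_pairs, "ckpt_finetune")
```

In this arrangement the intermediate stage reuses the same training loop as fine-tuning but swaps in a domain corpus and a denoising-style objective, which is what narrows the data distribution gap before the final story-to-moral generation task.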