Research on Chinese Legal Document Reading Comprehension Based on Pre-Training Model

Lufeng Yuan, Wei Zhang, Xiaoxin Gao, Linlin Zhao, Bin Liu, Maokai Liu
Proceedings of the 5th International Conference on Computer Science and Application Engineering. Published 2021-10-19. DOI: 10.1145/3487075.3487157 (https://doi.org/10.1145/3487075.3487157)

Abstract

We study machine reading comprehension of Chinese legal documents. We first analyze the difficulties of the task. The data are severely imbalanced across span-extraction, yes/no, and unanswerable queries: span-extraction queries account for more than 80% of the total. Reading comprehension of Chinese legal documents is also a typical long-text reading problem. We then propose a framework for reading and understanding Chinese legal documents. Built on the BERT pre-trained model, the framework is fine-tuned on Chinese legal documents, adopts a variety of deep learning models, and uses data augmentation and an ensemble strategy. Finally, we test the framework on real legal documents, where the macro-average F value reaches 82.773.
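
The abstract does not say how the framework handles the long-text problem. One common approach with BERT-style models, whose input length is limited, is to split a document into overlapping windows and run the reader on each window. The sketch below illustrates that general technique only; the function name `split_into_windows` and the values of `max_len` and `stride` are assumptions for illustration, not details taken from the paper.

```python
def split_into_windows(tokens, max_len=512, stride=128):
    """Split a token list into overlapping windows.

    Hypothetical helper, not from the paper. `stride` is the step between
    consecutive window starts, so neighbouring windows share
    (max_len - stride) tokens when stride < max_len.
    Returns a list of (start_offset, window_tokens) pairs.
    """
    if max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("require max_len > 0 and 0 < stride <= max_len")

    windows = []
    start = 0
    while True:
        end = min(start + max_len, len(tokens))
        windows.append((start, tokens[start:end]))
        if end == len(tokens):
            break  # the last window reaches the end of the document
        start += stride
    return windows


if __name__ == "__main__":
    # Toy document of 10 "tokens"; real input would come from a tokenizer.
    doc = list(range(10))
    for offset, window in split_into_windows(doc, max_len=4, stride=2):
        print(offset, window)
```

Keeping the start offset with each window makes it possible to map a predicted answer span back to its position in the full document.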