Lufeng Yuan, Wei Zhang, Xiaoxin Gao, Linlin Zhao, Bin Liu, Maokai Liu
Proceedings of the 5th International Conference on Computer Science and Application Engineering, published 2021-10-19. DOI: 10.1145/3487075.3487157 (https://doi.org/10.1145/3487075.3487157)
Research on Chinese Legal Document Reading Comprehension Based on Pre-Training Model
We study machine reading comprehension of Chinese legal documents. We first analyze what makes the task difficult. The query types are severely imbalanced: span-extraction queries account for more than 80% of the data, with yes/no queries and unanswerable queries making up the remainder. The task is also a typical long-text reading problem. We then propose a framework for reading comprehension of Chinese legal documents. Built on the BERT pre-trained model, the framework is fine-tuned on Chinese legal documents, combines several deep learning models, and uses data augmentation and an ensemble strategy. Finally, we evaluate the framework on real legal documents, where it reaches a macro-average F score of 82.773.
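The abstract describes the task as a long-text reading problem but does not say how long documents are fed to BERT. One common approach, shown here only as an illustrative sketch and not as the authors' method, is to split the document into overlapping windows so that each piece fits within the encoder's input limit and an answer span near a boundary still appears whole in at least one window. The function name `split_into_windows` and the parameters `window` and `overlap` are hypothetical, and treating each character as one token is a simplifying assumption.

```python
from typing import List, Sequence, Tuple


def split_into_windows(
    tokens: Sequence[str], window: int = 384, overlap: int = 128
) -> List[Tuple[int, List[str]]]:
    """Split a token sequence into overlapping windows.

    Returns (start_offset, window_tokens) pairs. The offset lets a span
    predicted inside a window be mapped back to document positions.
    Hypothetical helper for illustration; the defaults are assumptions.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")

    step = window - overlap  # how far each new window advances
    windows: List[Tuple[int, List[str]]] = []
    start = 0
    while True:
        end = min(start + window, len(tokens))
        windows.append((start, list(tokens[start:end])))
        if end == len(tokens):  # last window reaches the end of the text
            break
        start += step
    return windows


def to_document_span(offset: int, local_start: int, local_end: int) -> Tuple[int, int]:
    """Map a window-local answer span back to document token positions."""
    return offset + local_start, offset + local_end
```

In use, each window would be paired with the query and scored separately, and the best-scoring span across windows would be kept after converting it with `to_document_span`. Any span no longer than `overlap` tokens is guaranteed to fall entirely inside at least one window.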