Title: VNLawBERT: A Vietnamese Legal Answer Selection Approach Using BERT Language Model
Authors: Chieu-Nguyen Chau, Truong-Son Nguyen, Le-Minh Nguyen
Venue: 2020 7th NAFOSTED Conference on Information and Computer Science (NICS)
Published: 2020-11-26
DOI: 10.1109/NICS51282.2020.9335906
Citations: 11
Abstract
Recent advances in Natural Language Processing (NLP) and deep learning have produced question answering systems that achieve strong results, yet few such solutions target the Vietnamese legal domain. In this research, we propose an answer selection approach that fine-tunes the BERT language model on our corpus of Vietnamese legal question-answer pairs, achieving an 87% F1-score. We further pre-train the original BERT model on a Vietnamese legal domain-specific corpus and achieve a higher F1-score of 90.6% on the same task, which suggests the potential of a domain-specific pre-trained language model for the legal area.
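The answer selection task described above is typically cast as binary sentence-pair classification: each (question, candidate answer) pair is packed into a single BERT input sequence and labeled as correct or incorrect, with the binary F1-score measuring quality. A minimal sketch of the pair formatting and the F1 computation (the function names and toy labels are illustrative, not from the paper):

```python
def make_bert_pair_input(question: str, answer: str) -> str:
    """Pack a question-answer pair into BERT's standard two-segment
    input format: [CLS] segment A [SEP] segment B [SEP]."""
    return f"[CLS] {question} [SEP] {answer} [SEP]"


def f1_score(gold: list, pred: list) -> float:
    """Binary F1 over parallel 0/1 label lists (1 = correct answer)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```

In practice the `[CLS]`/`[SEP]` packing is handled by the model's tokenizer, and the final hidden state of the `[CLS]` token feeds a classification head that scores each candidate answer.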