A new conceptual framework for enhancing legal information retrieval at the Brazilian Superior Court of Justice

Thiago Gomes, M. Ladeira
Proceedings of the 12th International Conference on Management of Digital EcoSystems, published 2020-11-02
DOI: https://doi.org/10.1145/3415958.3433087
Citations: 8

Abstract

Effective retrieval of jurisprudence (case law) is imperative for achieving consistency and predictability in any legal system. In this work, we propose and empirically evaluate a framework for jurisprudence retrieval at the Brazilian Superior Court of Justice, intended to ease the retrieval of other decisions that share the same legal opinion. The experimental results show that our approach, based on text similarity, performs better than the Court's legacy system, which is based on Boolean queries. Building complex Boolean queries requires specialized expertise, so we aim to offer a tool that accepts free text as queries without any operators. With the legacy system as the baseline, we compare the traditional TF-IDF retrieval model, the BM25 probabilistic model, and the Word2Vec model. Our results indicate that the Word2Vec Skip-Gram model trained on a specialized legal corpus and BM25 yield similar performance, and both surpass the legacy system. Combining the BM25 model with embedding models improved performance by up to 19%.
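
The abstract does not say how BM25 and the embedding models are combined, so the following is only a minimal sketch of one common way to do it, not the authors' method. It assumes a Lucene-style smoothed BM25, document vectors formed by averaging word vectors, min-max normalization of both score lists, and a weighted sum controlled by a hypothetical parameter `alpha`. The function names, the toy documents, and the two-dimensional `vectors` table are all illustrative assumptions; in practice the vectors would come from a Word2Vec Skip-Gram model trained on a legal corpus.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def mean_vector(tokens, vectors):
    """Average the vectors of known tokens; return None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is missing or zero."""
    if u is None or v is None:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def min_max(xs):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(xs), max(xs)
    return [0.0 for _ in xs] if hi == lo else [(x - lo) / (hi - lo) for x in xs]

def hybrid_rank(query, docs, vectors, alpha=0.5):
    """Return document indices ranked by a weighted sum of both signals."""
    lexical = min_max(bm25_scores(query, docs))
    q_vec = mean_vector(query, vectors)
    semantic = min_max([cosine(q_vec, mean_vector(d, vectors)) for d in docs])
    combined = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)

# Toy data (hypothetical): three tokenized decisions and hand-made 2-d vectors.
docs = [
    ["appeal", "tax", "refund", "denied"],
    ["contract", "breach", "damages", "awarded"],
    ["tax", "exemption", "granted", "appeal"],
]
vectors = {
    "contract": [1.0, 0.0], "breach": [0.9, 0.1], "damages": [0.8, 0.2],
    "tax": [0.0, 1.0], "refund": [0.1, 0.9], "exemption": [0.1, 0.9],
    "appeal": [0.5, 0.5],
}
query = ["contract", "damages"]  # free text, no Boolean operators
ranking = hybrid_rank(query, docs, vectors)
print(ranking)
```

Setting `alpha=1.0` reduces the ranking to BM25 alone and `alpha=0.0` to embedding similarity alone, which makes it easy to compare the individual models against the combination on the same queries.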