S. Georgette Graham, Hamidreza Soltani, Olufemi Isiaq
{"title":"Natural language processing for legal document review: categorising deontic modalities in contracts","authors":"S. Georgette Graham, Hamidreza Soltani, Olufemi Isiaq","doi":"10.1007/s10506-023-09379-2","DOIUrl":"10.1007/s10506-023-09379-2","url":null,"abstract":"<div><p>The contract review process can be a costly and time-consuming task for lawyers and clients alike, requiring significant effort to identify and evaluate the legal implications of individual clauses. To address this challenge, we propose the use of natural language processing techniques, specifically text classification based on deontic tags, to streamline the process. Our research question is whether natural language processing techniques, specifically dense vector embeddings, can help semi-automate the contract review process and reduce time and costs for legal professionals reviewing deontic modalities in contracts. In this study, we create a domain-specific dataset and train both baseline and neural network models for contract sentence classification. This approach offers a more efficient and cost-effective solution for contract review, mimicking the work of a lawyer. Our approach achieves an accuracy of 0.90, showcasing its effectiveness in identifying and evaluating individual contract sentences.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 1","pages":"79 - 100"},"PeriodicalIF":3.1,"publicationDate":"2023-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135043189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marcos Aurélio Domingues, Edleno Silva de Moura, Leandro Balby Marinho, Altigran da Silva
{"title":"A large scale benchmark for session-based recommendations on the legal domain","authors":"Marcos Aurélio Domingues, Edleno Silva de Moura, Leandro Balby Marinho, Altigran da Silva","doi":"10.1007/s10506-023-09378-3","DOIUrl":"10.1007/s10506-023-09378-3","url":null,"abstract":"<div><p>The proliferation of legal documents in various formats and their dispersion across multiple courts present a significant challenge for users seeking precise matches to their information requirements. Despite notable advancements in legal information retrieval systems, research into legal recommender systems remains limited. A plausible factor contributing to this scarcity could be the absence of extensive publicly accessible datasets or benchmarks. While a few studies have emerged in this field, a comprehensive analysis of the distinct attributes of legal data that influence the design of effective legal recommenders is notably absent in the current literature. This paper addresses this gap by initially amassing a comprehensive session-based dataset from Jusbrasil, one of Brazil’s largest online legal platforms. Subsequently, we scrutinize and discuss key facets of legal session-based recommendation data, including session duration, types of recommendable legal artifacts, coverage, and popularity. Furthermore, we introduce the first session-based recommendation benchmark tailored to the legal domain, shedding light on the performance and constraints of several renowned session-based recommendation approaches. 
These evaluations are based on real-world data sourced from Jusbrasil.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 1","pages":"43 - 78"},"PeriodicalIF":3.1,"publicationDate":"2023-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135113045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating legal event and context information for Chinese similar case analysis","authors":"Jingpei Dan, Lanlin Xu, Yuming Wang","doi":"10.1007/s10506-023-09377-4","DOIUrl":"10.1007/s10506-023-09377-4","url":null,"abstract":"<div><p>Similar case analysis (SCA) is an essential topic in legal artificial intelligence, serving as a reference for legal professionals. Most existing works treat SCA as a traditional text classification task and ignore some important legal elements that affect the verdict and case similarity, like legal events, and thus are easily misled by semantic structure. To address this issue, we propose a Legal Event-Context Model named LECM to improve the accuracy and interpretability of SCA based on a Chinese legal corpus. The event-context integration mechanism, which is an essential component of the LECM, is proposed to integrate the legal event and context information based on the attention mechanism, enabling legal events to be associated with their corresponding relevant contexts. We introduce an event detection module to obtain the legal event information, which is pre-trained on a legal event detection dataset to avoid labeling events manually. We conduct extensive experiments on two SCA tasks, i.e., similar case matching (SCM) and similar case retrieval (SCR). Compared with baseline models, LECM achieves average improvements of about 13% in mean average precision and 11% in accuracy on the SCR and SCM tasks, respectively. 
These results indicate that LECM effectively utilizes event-context knowledge to enhance SCA performance and suggest its potential for application in various legal document analysis tasks.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 1","pages":"1 - 42"},"PeriodicalIF":3.1,"publicationDate":"2023-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135217653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel network-based paragraph filtering technique for legal document similarity analysis","authors":"Mayur Makawana, Rupa G. Mehta","doi":"10.1007/s10506-023-09375-6","DOIUrl":"https://doi.org/10.1007/s10506-023-09375-6","url":null,"abstract":"","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135779227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gianluca Moro, Nicola Piscaglia, Luca Ragazzi, Paolo Italiani
{"title":"Multi-language transfer learning for low-resource legal case summarization","authors":"Gianluca Moro, Nicola Piscaglia, Luca Ragazzi, Paolo Italiani","doi":"10.1007/s10506-023-09373-8","DOIUrl":"10.1007/s10506-023-09373-8","url":null,"abstract":"<div><p>Analyzing and evaluating legal case reports are labor-intensive tasks for judges and lawyers, who usually base their decisions on report abstracts, legal principles, and commonsense reasoning. Thus, summarizing legal documents is time-consuming and requires excellent human expertise. Moreover, public legal corpora of specific languages are almost unavailable. This paper proposes a transfer learning approach with extractive and abstractive techniques to cope with the lack of labeled legal summarization datasets, namely a low-resource scenario. In particular, we conducted extensive multi- and cross-language experiments. The proposed work outperforms the state-of-the-art results of extractive summarization on the Australian Legal Case Reports dataset and sets a new baseline for abstractive summarization. Finally, syntactic and semantic metrics assessments have been carried out to evaluate the accuracy and the factual consistency of the machine-generated legal summaries.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"32 4","pages":"1111 - 1139"},"PeriodicalIF":3.1,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-023-09373-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135768568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Raphaël Gyory, David Restrepo Amariles, Gregory Lewkowicz, Hugues Bersini
{"title":"Ant: a process aware annotation software for regulatory compliance","authors":"Raphaël Gyory, David Restrepo Amariles, Gregory Lewkowicz, Hugues Bersini","doi":"10.1007/s10506-023-09372-9","DOIUrl":"10.1007/s10506-023-09372-9","url":null,"abstract":"<div><p>Accurate data annotation is essential to successfully implementing machine learning (ML) for regulatory compliance. Annotations allow organizations to train supervised ML algorithms and to adapt and audit the software they buy. The lack of annotation tools focused on regulatory data is slowing the adoption of established ML methodologies and process models, such as CRISP-DM, in various legal domains, including regulatory compliance. This article introduces Ant, an open-source annotation software for regulatory compliance. Ant is designed to adapt to complex organizational processes and enable compliance experts to be in control of ML projects. By drawing on Business Process Modeling (BPM), we show that Ant can help lift major technical bottlenecks to effectively implementing regulatory compliance through software, such as access to multiple sources of heterogeneous data and the integration of process complexities in the ML pipeline. 
We provide empirical data to validate the performance of Ant, illustrate its potential to speed up the adoption of ML in regulatory compliance, and highlight its limitations.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"32 4","pages":"1075 - 1110"},"PeriodicalIF":3.1,"publicationDate":"2023-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-023-09372-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42450021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lessons learned building a legal inference dataset","authors":"Sungmi Park, Joshua I. James","doi":"10.1007/s10506-023-09370-x","DOIUrl":"10.1007/s10506-023-09370-x","url":null,"abstract":"<div><p>Legal inference is fundamental for building and verifying hypotheses in police investigations. In this study, we build a Natural Language Inference dataset in Korean for the legal domain, focusing on criminal court verdicts. We developed an adversarial hypothesis collection tool that can challenge the annotators and give us a deep understanding of the data, and a hypothesis network construction tool with visualized graphs to show a use case scenario of the developed model. The data is augmented using a combination of Easy Data Augmentation approaches and round-trip translation, as crowd-sourcing might not be an option for datasets with sensitive data. We extensively discuss challenges we have encountered, such as the annotator’s limited domain knowledge, issues in the data augmentation process, and problems with handling long contexts, and suggest possible solutions to the issues. Our work shows that creating legal inference datasets with limited resources is feasible and proposes further research in this area.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"32 4","pages":"1011 - 1044"},"PeriodicalIF":3.1,"publicationDate":"2023-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48589915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniela Vianna, Edleno Silva de Moura, Altigran Soares da Silva
{"title":"A topic discovery approach for unsupervised organization of legal document collections","authors":"Daniela Vianna, Edleno Silva de Moura, Altigran Soares da Silva","doi":"10.1007/s10506-023-09371-w","DOIUrl":"10.1007/s10506-023-09371-w","url":null,"abstract":"<div><p>Technology has substantially transformed the way legal services operate in many different countries. With a large and complex collection of digitized legal documents, the judiciary system worldwide presents a promising scenario for the development of intelligent tools. In this work, we tackle the challenging task of organizing and summarizing the constantly growing collection of legal documents, uncovering hidden topics, or themes that later can support tasks such as legal case retrieval and legal judgment prediction. Our approach to this problem relies on topic discovery techniques combined with a variety of preprocessing techniques and learning-based vector representations of words, such as Doc2Vec and BERT-like models. The proposed method was validated using four different datasets composed of short and long legal documents in Brazilian Portuguese, from legal decisions to chapters in legal books. 
Analysis conducted by a team of legal specialists revealed the effectiveness of the proposed approach to uncover unique and relevant topics from large collections of legal documents, serving many purposes, such as giving support to legal case retrieval tools and also providing the team of legal specialists with a tool that can accelerate their work of labeling/tagging legal documents.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"32 4","pages":"1045 - 1074"},"PeriodicalIF":3.1,"publicationDate":"2023-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46420377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mingruo Yuan, Ben Kao, Tien-Hsuan Wu, Michael M. K. Cheung, Henry W. H. Chan, Anne S. Y. Cheung, Felix W. H. Chan, Yongxi Chen
{"title":"Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model","authors":"Mingruo Yuan, Ben Kao, Tien-Hsuan Wu, Michael M. K. Cheung, Henry W. H. Chan, Anne S. Y. Cheung, Felix W. H. Chan, Yongxi Chen","doi":"10.1007/s10506-023-09367-6","DOIUrl":"10.1007/s10506-023-09367-6","url":null,"abstract":"<div><p>Access to legal information is fundamental to access to justice. Yet accessibility refers not only to making legal documents available to the public, but also to rendering legal information comprehensible to them. A vexing problem in bringing legal information to the public is how to turn formal legal documents such as legislation and judgments, which are often highly technical, into easily navigable and comprehensible knowledge for those without legal education. In this study, we formulate a three-step approach for bringing legal knowledge to laypersons, tackling the issues of navigability and comprehensibility. First, we translate selected sections of the law into snippets (called CLIC-pages), each being a short article that focuses on explaining a certain technical legal concept in layperson’s terms. Second, we construct a <i>Legal Question Bank</i> (LQB), which is a collection of legal questions whose answers can be found in the CLIC-pages. Third, we design an interactive <i>CLIC Recommender</i> (CRec). Given a user’s verbal description of a legal situation that requires a legal solution, CRec interprets the user’s input and shortlists questions from the question bank that are most likely relevant to the given legal situation and recommends their corresponding CLIC-pages where relevant legal knowledge can be found. In this paper we focus on the technical aspects of creating an LQB. We show how large-scale pre-trained language models, such as GPT-3, can be used to generate legal questions. 
We compare machine-generated questions (MGQs) against human-composed questions (HCQs) and find that MGQs are more scalable, cost-effective, and more diversified, while HCQs are more precise. We also show a prototype of CRec and illustrate through an example how our three-step approach effectively brings relevant legal knowledge to the public.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"32 3","pages":"769 - 805"},"PeriodicalIF":3.1,"publicationDate":"2023-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42058228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Carlos Rafael Rodríguez Rodríguez, Yarina Amoroso Fernández, Denis Sergeevich Zuev, Marieta Peña Abreu, Yeleny Zulueta Veliz
{"title":"M-LAMAC: a model for linguistic assessment of mitigating and aggravating circumstances of criminal responsibility using computing with words","authors":"Carlos Rafael Rodríguez Rodríguez, Yarina Amoroso Fernández, Denis Sergeevich Zuev, Marieta Peña Abreu, Yeleny Zulueta Veliz","doi":"10.1007/s10506-023-09365-8","DOIUrl":"10.1007/s10506-023-09365-8","url":null,"abstract":"<div><p>The general mitigating and aggravating circumstances of criminal liability are elements attached to the crime that, when they occur, affect the punishment quantum. Cuban criminal legislation provides a catalog of such circumstances and some general conditions for their application. Such norms give judges broad discretion in assessing circumstances and adjusting punishment based on the intensity of those circumstances. In the interest of broad judicial discretion, the law does not establish specific ways for measuring circumstances’ intensity. This gives judges more freedom and autonomy, but it also imposes on them more social responsibility and challenges them to manage the uncertainty and subjectivity inherent in this complex activity. This paper proposes a model to aid the linguistic assessment of circumstances’ intensity and to provide linguistic and numerical recommendations to determine an appropriate punishment interval. M-LAMAC determines the collective evaluation of circumstances of the same type, determines the prevalence of a type of circumstance by means of a compensation function, recommends the required modification in the input interval, and finally recommends a numerical interval adjusted to the judges’ initially expressed preferences. 
The model’s applicability is demonstrated by means of several experiments on a fictitious case of bank document forgery.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"32 3","pages":"697 - 739"},"PeriodicalIF":3.1,"publicationDate":"2023-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48842789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}