{"title":"Precedent-based reasoning with incomplete information for human-in-the-loop decision support","authors":"Daphne Odekerken, Floris Bex, Henry Prakken","doi":"10.1007/s10506-024-09421-x","DOIUrl":"10.1007/s10506-024-09421-x","url":null,"abstract":"<div><p>We define and study the notions of stability and relevance for precedent-based reasoning, focusing on Horty’s result model of precedential constraint. According to this model, precedents constrain the possible outcomes for a focus case, which is a yet undecided case, where precedents and the focus case are compared on their characteristics (called dimensions). In this paper, we refer to the enforced outcome for the focus case as its <i>justification</i> status. In contrast to earlier work, we do not assume that all dimension values of the focus case or the precedent cases have been established with certainty: rather, each dimension is assigned a set of possible values. We define a focus case as <i>stable</i> if its justification status is the same for every choice of the possible values. For focus cases that are not stable, we study the task of identifying <i>relevance</i>: which possible values should be excluded to make the focus case stable? In addition, we introduce the notion of <i>possibility</i> to verify if a user can assign an outcome to an unstable focus case without making the case base of precedents inconsistent. We show how the tasks of identifying justification, stability, relevance and possibility can be applied for human-in-the-loop decision support. 
Finally, we discuss the computational complexity of these tasks and provide efficient algorithms.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"107 - 152"},"PeriodicalIF":3.1,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09421-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147559198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"It cannot be right if it was written by AI: on lawyers’ preferences of documents perceived as authored by an LLM vs a human","authors":"Jakub Harasta, Tereza Novotná, Jaromir Savelka","doi":"10.1007/s10506-024-09422-w","DOIUrl":"10.1007/s10506-024-09422-w","url":null,"abstract":"<div><p>Large Language Models (LLMs) enable a future in which certain types of legal documents may be generated automatically. This has great potential to streamline legal processes, lower the cost of legal services, and dramatically increase access to justice. While many researchers focus on proposing and evaluating LLM-based applications supporting tasks in the legal domain, there is a notable lack of investigations into how legal professionals perceive content if they believe an LLM has generated it. Yet, this is a critical point as over-reliance or unfounded scepticism may influence whether such documents bring about appropriate legal consequences. This study is a necessary analysis of the ongoing transition towards mature generative AI systems. Specifically, we examined whether the perception of legal documents by lawyers and law students (n = 75) varies based on their assumed origin (human-crafted vs AI-generated). The participants evaluated the documents, focusing on their correctness and language quality. Our analysis revealed a clear preference for documents perceived as crafted by a human over those believed to be generated by AI. At the same time, most participants expect a future in which documents will be generated automatically. 
These findings could be leveraged by legal practitioners, policymakers, and legislators to implement and adopt legal document generation technology responsibly and to fuel the necessary discussions on how legal processes should be updated to reflect recent technological developments.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"153 - 190"},"PeriodicalIF":3.1,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147559069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining EU consultations through AI","authors":"Fabiana Di Porto, Paolo Fantozzi, Maurizio Naldi, Nicoletta Rangone","doi":"10.1007/s10506-024-09426-6","DOIUrl":"10.1007/s10506-024-09426-6","url":null,"abstract":"<div><p>Consultations are key to gathering evidence that informs rulemaking. When analysing the feedback received, it is essential for the regulator to appropriately cluster stakeholders’ opinions, as misclustering may alter the representativeness of the positions, making some of them appear majoritarian when they might not be. The European Commission (EC)’s approach to clustering opinions in consultations lacks a standardized methodology, leading to reduced procedural transparency, while making use of computational tools only sporadically. This paper explores how natural language processing (NLP) technologies may enhance the way opinion clustering is currently conducted by the EC. We examine 830 responses to three legislative proposals (the Artificial Intelligence Act, the Digital Markets Act and the Digital Services Act) using both a lexical and a semantic approach. We find that some groups (like small and medium companies) have low similarity across all datasets and methodologies despite being clustered in one opinion group by the EC. The same happens for citizens and consumer associations in the consultation on the DSA. These results suggest that computational tools actually help reduce misclustering of stakeholders’ opinions and consequently allow greater representativeness of the different positions expressed in consultations. They further suggest that the EC could identify a convergent methodology for all its consultations, where such tools are employed in a consistent and replicable manner rather than occasionally. Ideally, it should also explain when one methodology is preferred to another. This effort should find its way into the Better Regulation toolbox (EC 2023). 
Our analysis also paves the way for further research to reach a transparent and consistent methodology for group clustering.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"267 - 304"},"PeriodicalIF":3.1,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09426-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147562013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deciphering disagreement in the annotation of EU legislation","authors":"Gijs van Dijck, Carlos Aguilera, Shashank M. Chakravarthy","doi":"10.1007/s10506-024-09423-9","DOIUrl":"10.1007/s10506-024-09423-9","url":null,"abstract":"<div><p>The topic of annotating legal data has received surprisingly little attention. A key challenge of the annotation process is reaching sufficient agreement between annotators and filtering mistakes from genuine disagreement. This study presents an approach that provides insights into and resolves potential disagreement amongst annotators. It (1) introduces different strategies to calculate agreement levels, (2) compares agreement levels between annotators (inter-annotator agreement) before and after a revision round, and (3) compares agreement levels for annotators who annotate the same texts twice (intra-annotator agreement). The inter-annotator agreement levels are compared to a revision round in which an arbiter corrected the annotators’ labels. The analysis is based on the annotation of EU legislative provisions at two stages (initial annotations, after annotator revisions) and for various tasks (Definitions, References, Quantities, IF-THEN statements, Exceptions, Scope, Hierarchy, Deontic Clauses, Active and Passive Role) by multiple annotators. The results reveal that agreement levels vary based on the stage of measurement (before/after revisions), the nature of the task, the method of assessment, and the annotator combination. The agreement scores, along with some initial measurements, align with those reported in previous research but increase after each revision round. This suggests that annotator revisions can substantially reduce disagreement. Additionally, disagreements were found not only between but also among annotators. 
This inconsistency does not appear to stem from a lack of understanding of the guidelines or a lack of seriousness in task execution, as evidenced by moderate to substantial inter-annotator agreement scores. These findings suggest that annotators identified multiple valid interpretations, which highlights the complexity of annotating legislative provisions. The results underscore the significance of embracing, addressing, and reporting on (dis)agreement in different ways and at the various stages of an annotation task.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"191 - 226"},"PeriodicalIF":3.1,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09423-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147559529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A case study for automated attribute extraction from legal documents using large language models","authors":"Subinay Adhikary, Procheta Sen, Dwaipayan Roy, Kripabandhu Ghosh","doi":"10.1007/s10506-024-09425-7","DOIUrl":"10.1007/s10506-024-09425-7","url":null,"abstract":"<div><p>The escalating number of pending cases is a growing concern worldwide. Recent advancements in digitization have opened up possibilities for leveraging artificial intelligence (AI) tools in the processing of legal documents. Adopting a structured representation for legal documents, as opposed to a mere bag-of-words flat text representation, can significantly enhance processing capabilities. With the aim of achieving this objective, we put forward a set of diverse attributes for criminal case proceedings. To enhance the effectiveness of automatically extracting these attributes from legal documents within a sequence labeling framework, we propose the utilization of a few-shot learning approach based on Large Language Models (LLMs). Moreover, we demonstrate the efficacy of the extracted attributes in downstream tasks, such as <i>legal judgment prediction and legal statute prediction</i>.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"245 - 266"},"PeriodicalIF":3.1,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09425-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147558841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advancing legal recommendation system with enhanced Bayesian network machine learning","authors":"Xukang Wang, Vanessa Hoo, Mingyue Liu, Jiale Li, Ying Cheng Wu","doi":"10.1007/s10506-024-09424-8","DOIUrl":"10.1007/s10506-024-09424-8","url":null,"abstract":"<div><p>The integration of machine learning algorithms into the legal recommendation system marks a burgeoning area of research, with a particular focus on enhancing the accuracy and efficiency of judicial decision-making processes. The application of Bayesian Network (BN) emerges as a potent tool in this context, promising to address the inherent complexities and unique nuances of legal texts and individual case subtleties. However, the challenge of achieving high accuracy in BN parameter learning, especially under conditions of limited data, remains a significant hurdle. This study proposes an Enhanced Maximum Parameter Learning (EMPL) algorithm, tailored for BN parameter optimization in scenarios characterized by small sample sizes. The EMPL algorithm, innovatively incorporating the Synthetic Minority Over-sampling Technique (SMOTE), begins with the formulation of inequality constraints derived from domain expertise. It establishes a minimal dataset threshold necessary for effective parameter learning. Through the introduction of an index weighting factor function that dynamically adjusts according to the sample size, the algorithm facilitates the derivation of refined BN parameters. The core innovation of the EMPL algorithm lies in its use of an exponentially weighted factor function, designed to be responsive to variations in sample size, and its capacity to expand the parameter space using SMOTE to align with qualitative constraints from expert insights. This approach enables the integration of data-derived parameters with those obtained through expert experience in an exponentially weighted manner, culminating in the optimization of BN parameters. 
Comparative analysis reveals that the EMPL algorithm achieves superior learning accuracy over traditional Maximum Likelihood Estimation (MLE) and qualitative maximum a posteriori (QMAP) approaches, particularly in contexts of sparse data. Furthermore, it demonstrates enhanced performance relative to variable weight learning algorithms, underscoring its potential to significantly improve decision-making processes in the legal domain through advanced BN parameter learning.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"227 - 244"},"PeriodicalIF":3.1,"publicationDate":"2024-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147558680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to: The digital transformation of jurisprudence: an evaluation of ChatGPT-4’s applicability to solve cases in business law","authors":"Sascha Schweitzer, Markus Conrads","doi":"10.1007/s10506-024-09417-7","DOIUrl":"10.1007/s10506-024-09417-7","url":null,"abstract":"","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"305 - 309"},"PeriodicalIF":3.1,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09417-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147559482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segmenting Brazilian legislative text using weak supervision and active learning","authors":"Felipe A. Siqueira, Diany Pressato, Fabíola S. F. Pereira, Nádia F. F. da Silva, Ellen Souza, Márcio S. Dias, André C. P. L. F. de Carvalho","doi":"10.1007/s10506-024-09419-5","DOIUrl":"10.1007/s10506-024-09419-5","url":null,"abstract":"<div><p>Legislative houses all over the world are adopting tools based on artificial intelligence to support their work. The incorporation of these tools can improve the analysis of the text of the proposed new laws and speed the preparation and discussion of new laws. The performance of artificial intelligence tools for text processing tasks is largely affected by the corpora used, which should ideally be adapted for the specific domain. When dealing with legislative corpora, text segmentation is often necessary due to the distinct purposes of legislative segments within the overall bill structure. While rule-based approaches can be effective in cases where the data follows a consistent format, they fail when inconsistencies arise in the formatting of legislative bills. In this study, we extensively investigate the use of weak supervision and active learning to accurately segment over 100,000 Brazilian federal legislative bills using a sequence tagging approach. The experiments demonstrated that both BERT and LSTM models achieved high statistical performance without the limitations of rule-based systems. In segmenting long documents beyond the limited context window of BERT, we find that simple moving windows suffice because the required context for accurate legislative segmentation is mostly local. We also conducted an analysis of transfer learning from our monolingual models to French, Italian, German, and English (US) legislative texts. 
According to our experimental results, our models show non-trivial zero-shot and effective out-of-distribution fine-tuning performance, suggesting potential avenues for multilingual legislative segmentation without the need for computationally expensive models. The models, data, and code are publicly available at https://github.com/ulysses-camara/ulysses-segmenter.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"1 - 82"},"PeriodicalIF":3.1,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147561888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"System for the anonymization of Romanian jurisprudence","authors":"Vasile Păiş, Radu Ion, Elena Irimia, Verginica Barbu Mititelu, Valentin Badea, Dan Tufiș","doi":"10.1007/s10506-024-09420-y","DOIUrl":"10.1007/s10506-024-09420-y","url":null,"abstract":"<div><p>The transparency of the judicial process and the consistency of judicial decisions can be improved through their publication. Access to jurisprudence is of paramount importance both for law professionals (judges, lawyers, law students) and for the larger public. However, public access must ensure the preservation of privacy for people involved, in accordance with national and international regulations. This paper presents the work behind building an artificial intelligence system for the anonymization of Romanian jurisprudence, allowing it to be accessed through the ReJust portal operated by the Superior Council of Magistracy in Romania.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"34 1","pages":"83 - 105"},"PeriodicalIF":3.1,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147559196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LAWSUIT: a LArge expert-Written SUmmarization dataset of ITalian constitutional court verdicts","authors":"Luca Ragazzi, Gianluca Moro, Stefano Guidi, Giacomo Frisoni","doi":"10.1007/s10506-024-09414-w","DOIUrl":"10.1007/s10506-024-09414-w","url":null,"abstract":"<div><p>Large-scale public datasets are vital for driving the progress of abstractive summarization, especially in law, where documents have highly specialized jargon. However, the available resources are English-centered, limiting research advancements in other languages. This paper introduces <span>LAWSUIT</span>, a collection of 14K Italian legal verdicts with expert-authored abstractive maxims drawn from the Constitutional Court of the Italian Republic. <span>LAWSUIT</span> presents an arduous task with lengthy source texts and evenly distributed salient content. We offer extensive experiments with sequence-to-sequence and segmentation-based approaches, revealing that the latter achieve better results in full and few-shot settings. We openly release <span>LAWSUIT</span> to foster the development and automation of real-world legal applications.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 4","pages":"1151 - 1187"},"PeriodicalIF":3.1,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09414-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145493398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}