{"title":"The digital transformation of jurisprudence: an evaluation of ChatGPT-4’s applicability to solve cases in business law","authors":"Sascha Schweitzer, Markus Conrads","doi":"10.1007/s10506-024-09406-w","DOIUrl":"10.1007/s10506-024-09406-w","url":null,"abstract":"<div><p>In the evolving landscape of legal information systems, ChatGPT-4 and other advanced conversational agents (CAs) offer the potential to disruptively transform the law industry. This study evaluates commercially available CAs within the German legal context, thereby assessing the generalizability of previous U.S.-based findings. Employing a unique corpus of 200 distinct legal tasks, ChatGPT-4 was benchmarked against Google Bard, Google Gemini, and its predecessor, ChatGPT-3.5. Human-expert and automated assessments of 4000 CA-generated responses reveal ChatGPT-4 to be the first CA to surpass the threshold of solving realistic legal tasks and passing a German business law exam. While ChatGPT-4 outperforms ChatGPT-3.5, Google Bard, and Google Gemini in both consistency and quality, the results demonstrate a considerable degree of variability, especially in complex cases with no predefined response options. Based on these findings, legal professionals should manually verify all texts produced by CAs before use. Novices must exercise caution with CA-generated legal advice, given the expertise needed for its assessment.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 3","pages":"847 - 871"},"PeriodicalIF":3.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09406-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141707348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intermediate factors and precedential constraint","authors":"Trevor Bench-Capon","doi":"10.1007/s10506-024-09405-x","DOIUrl":"10.1007/s10506-024-09405-x","url":null,"abstract":"<div><p>This paper explores the extension of formal accounts of precedential constraint to make use of a factor hierarchy with intermediate factors. A problem arises, however, because constraints expressed in terms of intermediate factors may give different outcomes from those expressed only using base level factors. We argue that constraints that use only base level factors yield the correct outcomes, but that intermediate factors play an important role in the justification and explanation of those outcomes. The discussion is illustrated with a running example.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 3","pages":"827 - 846"},"PeriodicalIF":3.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09405-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140962478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-training improves few-shot learning in legal artificial intelligence tasks","authors":"Yulin Zhou, Yongbin Qin, Ruizhang Huang, Yanping Chen, Chuan Lin, Yuan Zhou","doi":"10.1007/s10506-024-09403-z","DOIUrl":"10.1007/s10506-024-09403-z","url":null,"abstract":"<div><p>As the labeling costs in legal artificial intelligence tasks are expensive. Therefore, it becomes a challenge to utilize low cost to train a robust model. In this paper, we propose a LAIAugment approach, which aims to enhance the few-shot learning capability in legal artificial intelligence tasks. Specifically, we first use the self-training approach to label the amount of unlabelled data to enhance the feature learning capability of the model. Moreover, we also search for datasets that are similar to the training set by improving the text similarity function. We conducted experimental analyses for three legal artificial intelligence tasks, including evidence extraction, legal element extraction, and case multi-label prediction, which composed of 3500 judgement documents. The experimental results show that the proposed LAIAugment method has an average F1-score of 72.3% on the three legal AI tasks, which is 1.93% higher than the baseline model. At the same time, it shows a huge improvement in few-shot learning.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 3","pages":"809 - 825"},"PeriodicalIF":3.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140966566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI, Law and beyond. A transdisciplinary ecosystem for the future of AI & Law","authors":"Floris J. Bex","doi":"10.1007/s10506-024-09404-y","DOIUrl":"10.1007/s10506-024-09404-y","url":null,"abstract":"<div><p>We live in exciting times for AI and Law: technical developments are moving at a breakneck pace, and at the same time, the call for more robust AI governance and regulation grows stronger. How should we as an AI & Law community navigate these dramatic developments and claims? In this Presidential Address, I present my ideas for a way forward: researching, developing and evaluating real AI systems for the legal field with researchers from AI, Law and beyond. I will demonstrate how we at the Netherlands National Police Lab AI are developing responsible AI by combining insights from different disciplines, and how this connects to the future of our field.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 1","pages":"253 - 270"},"PeriodicalIF":3.1,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09404-y.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140969820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Japanese tort-case dataset for rationale-supported legal judgment prediction","authors":"Hiroaki Yamada, Takenobu Tokunaga, Ryutaro Ohara, Akira Tokutsu, Keisuke Takeshita, Mihoko Sumida","doi":"10.1007/s10506-024-09402-0","DOIUrl":"10.1007/s10506-024-09402-0","url":null,"abstract":"<div><p>This paper presents the first dataset for Japanese Legal Judgment Prediction (LJP), the Japanese Tort-case Dataset (JTD), which features two tasks: tort prediction and its rationale extraction. The rationale extraction task identifies the court’s accepting arguments from alleged arguments by plaintiffs and defendants, which is a novel task in the field. JTD is constructed based on annotated 3477 Japanese Civil Code judgments by 41 legal experts, resulting in 7978 instances with 59,697 of their alleged arguments from the involved parties. Our baseline experiments show the feasibility of the proposed two tasks, and our error analysis by legal experts identifies sources of errors and suggests future directions of the LJP research.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 3","pages":"783 - 807"},"PeriodicalIF":3.1,"publicationDate":"2024-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09402-0.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145110495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"InstructPatentGPT: training patent language models to follow instructions with human feedback","authors":"Jieh-Sheng Lee","doi":"10.1007/s10506-024-09401-1","DOIUrl":"10.1007/s10506-024-09401-1","url":null,"abstract":"<div><p>In this research, patent prosecution is conceptualized as a system of reinforcement learning from human feedback. The objective of the system is to increase the likelihood for a language model to generate patent claims that have a higher chance of being granted. To showcase the controllability of the language model, the system learns from granted patents and pre-grant applications with different rewards. The status of “granted” and “pre-grant” are perceived as labeled human feedback implicitly. In addition, specific to patent drafting, the experiments in this research demonstrate the model’s capability to learn from adjusting claim length and inclusion of limiting terms for narrowing claim scope. As proof of concept, the experiments focus on claim ones only and the training data originates from a patent dataset tailored specifically for artificial intelligence. Although the available human feedback in patent prosecution are limited and the quality of generated patent text requires improvement, the experiments following the 3-stage reinforcement learning from human feedback have demonstrated that generative language models are capable of reflecting the human feedback or intent in patent prosecution. To enhance the usability of language models, the implementation in this research utilizes modern techniques that enable execution on a single consumer-grade GPU. The demonstrated proof of concept, which reduces hardware requirements, will prove valuable in the future as more human feedback in patent prosecution become available for broader use, either within patent offices or in the public domain.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 3","pages":"739 - 782"},"PeriodicalIF":3.1,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09401-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140999297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Łukasz Górski, Błażej Kuźniacki, Marco Almada, Kamil Tyliński, Madalena Calvo, Pablo Matias Asnaghi, Luciano Almada, Hilario Iñiguez, Fernando Rubianes, Octavio Pera, Juan Ignacio Nigrelli
{"title":"Exploring explainable AI in the tax domain","authors":"Łukasz Górski, Błażej Kuźniacki, Marco Almada, Kamil Tyliński, Madalena Calvo, Pablo Matias Asnaghi, Luciano Almada, Hilario Iñiguez, Fernando Rubianes, Octavio Pera, Juan Ignacio Nigrelli","doi":"10.1007/s10506-024-09395-w","DOIUrl":"10.1007/s10506-024-09395-w","url":null,"abstract":"<div><p>This paper analyses whether current explainable AI (XAI) techniques can help to address taxpayer concerns about the use of AI in taxation. As tax authorities around the world increase their use of AI-based techniques, taxpayers are increasingly at a loss about whether and how the ensuing decisions follow the procedures required by law and respect their substantive rights. The use of XAI has been proposed as a response to this issue, but it is still an open question whether current XAI techniques are enough to meet existing legal requirements. The paper approaches this question in the context of a case study: a prototype tax fraud detector trained on an anonymized dataset of real-world cases handled by the Buenos Aires (Argentina) tax authority. The decisions produced by this detector are explained through the use of various classification methods, and the outputs of these explanation models are evaluated on their explanatory power and on their compliance with the legal obligation that tax authorities provide the rationale behind their decision-making. We conclude the paper by suggesting technical and legal approaches for designing explanation mechanisms that meet the needs of legal explanation in the tax domain.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 3","pages":"551 - 579"},"PeriodicalIF":3.1,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09395-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141005043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clement Guitton, Aurelia Tamò-Larrieux, Simon Mayer, Gijs van Dijck
{"title":"The challenge of open-texture in law","authors":"Clement Guitton, Aurelia Tamò-Larrieux, Simon Mayer, Gijs van Dijck","doi":"10.1007/s10506-024-09390-1","DOIUrl":"10.1007/s10506-024-09390-1","url":null,"abstract":"<div><p>An important challenge when creating automatically processable laws concerns open-textured terms. The ability to measure open-texture can assist in determining the feasibility of encoding regulation and where additional legal information is required to properly assess a legal issue or dispute. In this article, we propose a novel conceptualisation of open-texture with the aim of determining the extent of open-textured terms in legal documents. We <i>conceptualise open-texture as a lever</i> whose state is impacted by three types of forces: internal forces (the words within the text themselves), external forces (the resources brought to challenge the definition of words), and lateral forces (the merit of such challenges). We tested part of this conceptualisation with 26 participants by investigating agreement in paired annotators. Five key findings emerged. First, agreement on which words are open-texture within a legal text is possible and statistically significant. Second, agreement is even high at an average inter-rater reliability of 0.7 (Cohen’s kappa). Third, when there is agreement on the words, agreement on the Open-Texture Value is high. Fourth, there is a dependence between the Open-Texture Value and reasons invoked behind open-texture. Fifth, involving only four annotators can yield similar results compared to involving twenty more when it comes to only flagging <i>clauses</i> containing open-texture. We conclude the article by discussing limitations of our experiment and which remaining questions in real life cases are still outstanding.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 2","pages":"405 - 435"},"PeriodicalIF":3.1,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09390-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140694059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large language models in cryptocurrency securities cases: can a GPT model meaningfully assist lawyers?","authors":"Arianna Trozze, Toby Davies, Bennett Kleinberg","doi":"10.1007/s10506-024-09399-6","DOIUrl":"10.1007/s10506-024-09399-6","url":null,"abstract":"<div><p>Large Language Models (LLMs) could be a useful tool for lawyers. However, empirical research on their effectiveness in conducting legal tasks is scant. We study securities cases involving cryptocurrencies as one of numerous contexts where AI could support the legal process, studying GPT-3.5’s legal reasoning and ChatGPT’s legal drafting capabilities. We examine whether a) GPT-3.5 can accurately determine which laws are potentially being violated from a fact pattern, and b) whether there is a difference in juror decision-making based on complaints written by a lawyer compared to ChatGPT. We feed fact patterns from real-life cases to GPT-3.5 and evaluate its ability to determine correct potential violations from the scenario and exclude spurious violations. Second, we had mock jurors assess complaints written by ChatGPT and lawyers. GPT-3.5’s legal reasoning skills proved weak, though we expect improvement in future models, particularly given the violations it suggested tended to be correct (it merely missed additional, correct violations). ChatGPT performed better at legal drafting, and jurors’ decisions were not statistically significantly associated with the author of the document upon which they based their decisions. Because GPT-3.5 cannot satisfactorily conduct legal reasoning tasks, it would be unlikely to be able to help lawyers in a meaningful way at this stage. However, ChatGPT’s drafting skills (though, perhaps, still inferior to lawyers) could assist lawyers in providing legal services. Our research is the first to systematically study an LLM’s legal drafting and reasoning capabilities in litigation, as well as in securities law and cryptocurrency-related misconduct.</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 3","pages":"691 - 737"},"PeriodicalIF":3.1,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09399-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140731464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrea Galassi, Francesca Lagioia, Agnieszka Jabłonowska, Marco Lippi
{"title":"Unfair clause detection in terms of service across multiple languages","authors":"Andrea Galassi, Francesca Lagioia, Agnieszka Jabłonowska, Marco Lippi","doi":"10.1007/s10506-024-09398-7","DOIUrl":"10.1007/s10506-024-09398-7","url":null,"abstract":"<div><p>Most of the existing natural language processing systems for legal texts are developed for the English language. Nevertheless, there are several application domains where multiple versions of the same documents are provided in different languages, especially inside the European Union. One notable example is given by Terms of Service (ToS). In this paper, we compare different approaches to the task of detecting potential unfair clauses in ToS across multiple languages. In particular, after developing an annotated corpus and a machine learning classifier for English, we consider and compare several strategies to extend the system to other languages: building a novel corpus and training a novel machine learning system for each language, from scratch; projecting annotations across documents in different languages, to avoid the creation of novel corpora; translating training documents while keeping the original annotations; translating queries at prediction time and relying on the English system only. An extended experimental evaluation conducted on a large, original dataset indicates that the time-consuming task of re-building a novel annotated corpus for each language can often be avoided with no significant degradation in terms of performance.\u0000</p></div>","PeriodicalId":51336,"journal":{"name":"Artificial Intelligence and Law","volume":"33 3","pages":"641 - 689"},"PeriodicalIF":3.1,"publicationDate":"2024-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10506-024-09398-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140748324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}