Artificial Intelligence and Law: Latest Articles (Page 2)

Correction to: Code is law: how COMPAS affects the way the judiciary handles the risk of recidivism

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-04-02 DOI: 10.1007/s10506-024-09400-2

Christoph Engel, Lorenz Linhardt, Marcel Schubert

Citations: 0

Re-evaluating GPT-4’s bar exam performance

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-03-30 DOI: 10.1007/s10506-024-09396-9

Eric Martínez

Vol. 33, No. 3, pp. 581-604. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10506-024-09396-9.pdf

Abstract: Perhaps the most widely touted of GPT-4’s at-launch, zero-shot capabilities has been its reported 90th-percentile performance on the Uniform Bar Exam. This paper begins by investigating the methodological challenges in documenting and verifying the 90th-percentile claim, presenting four sets of findings that indicate that OpenAI’s estimates of GPT-4’s UBE percentile are overinflated. First, although GPT-4’s UBE score nears the 90th percentile when examining approximate conversions from February administrations of the Illinois Bar Exam, these estimates are heavily skewed towards repeat test-takers who failed the July administration and score significantly lower than the general test-taking population. Second, data from a recent July administration of the same exam suggests GPT-4’s overall UBE percentile was below the 69th percentile, and ~48th percentile on essays. Third, examining official NCBE data and using several conservative statistical assumptions, GPT-4’s performance against first-time test takers is estimated to be ~62nd percentile, including ~42nd percentile on essays. Fourth, when examining only those who passed the exam (i.e. licensed or license-pending attorneys), GPT-4’s performance is estimated to drop to ~48th percentile overall, and ~15th percentile on essays. In addition to investigating the validity of the percentile claim, the paper also investigates the validity of GPT-4’s reported scaled UBE score of 298. The paper successfully replicates the MBE score, but highlights several methodological issues in the grading of the MPT + MEE components of the exam, which call into question the validity of the reported essay score. Finally, the paper investigates the effect of different hyperparameter combinations on GPT-4’s MBE performance, finding no significant effect of adjusting temperature settings, and a significant effect of few-shot chain-of-thought prompting over basic zero-shot prompting. Taken together, these findings carry timely insights for the desirability and feasibility of outsourcing legally relevant tasks to AI models, as well as for the importance for AI developers to implement rigorous and transparent capabilities evaluations to help secure safe and trustworthy AI.

Citations: 0

Boosting court judgment prediction and explanation using legal entities

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-03-18 DOI: 10.1007/s10506-024-09397-8

Irene Benedetto, Alkis Koudounas, Lorenzo Vaiani, Eliana Pastor, Luca Cagliero, Francesco Tarasconi, Elena Baralis

Vol. 33, No. 3, pp. 605-640.

Abstract: The automatic prediction of court case judgments using Deep Learning and Natural Language Processing is challenged by the variety of norms and regulations, the inherent complexity of the forensic language, and the length of legal judgments. Although state-of-the-art transformer-based architectures and Large Language Models (LLMs) are pre-trained on large-scale datasets, the underlying model reasoning is not transparent to the legal expert. This paper jointly addresses court judgment prediction and explanation by not only predicting the judgment but also providing legal experts with sentence-based explanations. To boost the performance of both tasks, we leverage a legal named entity recognition step, which automatically annotates documents with meaningful domain-specific entity tags and masks the corresponding fine-grained descriptions. In such a way, transformer-based architectures and Large Language Models can attend to in-domain entity-related information in the inference process while neglecting irrelevant details. Furthermore, the explainer can boost the relevance of entity-enriched sentences while limiting the diffusion of potentially sensitive information. We also explore the use of in-context learning and lightweight fine-tuning to tailor LLMs to the legal language style and the downstream prediction and explanation tasks. The results obtained on a benchmark dataset from the Indian judicial system show the superior performance of entity-aware approaches to both judgment prediction and explanation.
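The entity-masking step mentioned in the abstract of "Boosting court judgment prediction and explanation using legal entities" can be illustrated with a minimal Python sketch. It assumes the entity spans have already been produced by some legal NER model, which is not shown. The names `mask_entities` and `span_of`, the tag labels, and the sample sentence are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# A span is (start offset, end offset, entity tag); offsets index into the text.
Span = Tuple[int, int, str]


def span_of(text: str, phrase: str, tag: str) -> Span:
    """Locate the first occurrence of `phrase` and label it with `tag`."""
    start = text.index(phrase)
    return (start, start + len(phrase), tag)


def mask_entities(text: str, spans: List[Span]) -> str:
    """Replace each entity span with its tag, e.g. 'Ram Kumar' -> '[PETITIONER]'."""
    pieces = []
    cursor = 0
    for start, end, tag in sorted(spans):
        if start < cursor:
            continue  # skip spans that overlap one already masked
        pieces.append(text[cursor:start])
        pieces.append(f"[{tag}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


# Hypothetical sentence and tags, for illustration only.
sentence = "The appeal filed by Ram Kumar before the High Court of Delhi is dismissed."
spans = [
    span_of(sentence, "Ram Kumar", "PETITIONER"),
    span_of(sentence, "High Court of Delhi", "COURT"),
]
masked = mask_entities(sentence, spans)
print(masked)  # The appeal filed by [PETITIONER] before the [COURT] is dismissed.
```

Masking in this way keeps the entity type visible to a downstream model while removing the specific name, which is one plausible reading of how the approach limits the spread of sensitive details.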

Citations: 0

A comparative user study of human predictions in algorithm-supported recidivism risk assessment

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-03-15 DOI: 10.1007/s10506-024-09393-y

Manuel Portela, Carlos Castillo, Songül Tolan, Marzieh Karimi-Haghighi, Antonio Andres Pueyo

Vol. 33, No. 2, pp. 471-517. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10506-024-09393-y.pdf

Abstract: In this paper, we study the effects of using an algorithm-based risk assessment instrument (RAI) to support the prediction of risk of violent recidivism upon release. The instrument we used is a machine learning version of RisCanvi used by the Justice Department of Catalonia, Spain. It was hypothesized that people can improve their performance on defining the risk of recidivism when assisted with a RAI, and that professionals can perform better than non-experts in the domain. Participants had to predict whether a person who has been released from prison will commit a new crime leading to re-incarceration within the next two years. This user study is done with (1) general participants from diverse backgrounds recruited through a crowdsourcing platform, and (2) targeted participants who are students and practitioners of data science, criminology, or social work and professionals who work with RisCanvi. We also run focus groups with participants of the targeted study, including people who use RisCanvi in a professional capacity, to interpret the quantitative results. Among other findings, we observe that algorithmic support systematically leads to more accurate predictions from all participants, but that statistically significant gains are only seen in the performance of targeted participants with respect to that of crowdsourced participants. Among other comments, professional participants indicate that they would not foresee using a fully-automated system in criminal risk assessment, but do consider it valuable for training, standardization, and to fine-tune or double-check their predictions on particularly difficult cases. We found that the revised prediction by using a RAI increases the performance of all groups, while professionals show a better performance in general. A RAI can also be considered for extending professional capacities and skills along their careers.

Citations: 0

Legal sentence boundary detection using hybrid deep learning and statistical models

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-03-14 DOI: 10.1007/s10506-024-09394-x

Reshma Sheik, Sneha Rao Ganta, S. Jaya Nirmala

Vol. 33, No. 2, pp. 519-549.

Abstract: Sentence boundary detection (SBD) represents an important first step in natural language processing since accurately identifying sentence boundaries significantly impacts downstream applications. Nevertheless, detecting sentence boundaries within legal texts poses a unique and challenging problem due to their distinct structural and linguistic features. Our approach utilizes deep learning models to leverage delimiter and surrounding context information as input, enabling precise detection of sentence boundaries in English legal texts. We evaluate various deep learning models, including domain-specific transformer models like LegalBERT and CaseLawBERT. To assess the efficacy of our deep learning models, we compare them with a state-of-the-art domain-specific statistical conditional random field (CRF) model. After considering model size, F1-score, and inference time, we identify the Convolutional Neural Network Model (CNN) as the top-performing deep learning model. To further enhance performance, we integrate the features of the CNN model into the subsequent CRF model, creating a hybrid architecture that combines the strengths of both models. Our experiments demonstrate that the hybrid model outperforms the baseline model, achieving a 4% improvement in the F1-score. Additional experiments showcase the superiority of the hybrid model over SBD open-source libraries when confronted with an out-of-domain test set. These findings underscore the importance of efficient SBD in legal texts and emphasize the advantages of employing deep learning models and hybrid architectures to achieve optimal performance.
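The idea of feeding "delimiter and surrounding context information" to a classifier, as described in the abstract of "Legal sentence boundary detection using hybrid deep learning and statistical models", can be sketched as a candidate-extraction step. The following Python sketch only builds the (left context, delimiter, right context) windows that a model might then label as boundary or non-boundary. The delimiter set, the window width, the function name `delimiter_windows`, and the sample text are assumptions made for illustration; the paper's actual input encoding is not given in the listing.

```python
import re
from typing import List, Tuple

# Assumed set of candidate delimiters; the paper may use a different set.
DELIMITERS = re.compile(r"[.;:?!]")

# (position, left context, delimiter, right context)
Window = Tuple[int, str, str, str]


def delimiter_windows(text: str, width: int = 6) -> List[Window]:
    """Collect every candidate delimiter with `width` characters on each side."""
    windows = []
    for match in DELIMITERS.finditer(text):
        pos = match.start()
        left = text[max(0, pos - width):pos]
        right = text[pos + 1:pos + 1 + width]
        windows.append((pos, left, match.group(), right))
    return windows


# Hypothetical legal-style text: only two of the five periods end a sentence.
sample = "See 42 U.S.C. 1983. The court agreed."
windows = delimiter_windows(sample)
for pos, left, delim, right in windows:
    print(pos, repr(left), delim, repr(right))
```

A classifier (a CNN in the paper's best-performing deep learning setup) would then decide, per window, whether the delimiter closes a sentence; citation abbreviations such as "U.S.C." are the kind of case that makes legal text harder than general prose.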

Citations: 0

Correction to: Reasoning with inconsistent precedents

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-02-16 DOI: 10.1007/s10506-024-09392-z

Ilaria Canavotto

Citations: 0

Combining prompt-based language models and weak supervision for labeling named entity recognition on legal documents

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-02-15 DOI: 10.1007/s10506-023-09388-1

Vitor Oliveira, Gabriel Nogueira, Thiago Faleiros, Ricardo Marcacini

Vol. 33, No. 2, pp. 361-381.

Abstract: Named entity recognition (NER) is a very relevant task for text information retrieval in natural language processing (NLP) problems. Most recent state-of-the-art NER methods require humans to annotate and provide useful data for model training. However, using human power to identify, circumscribe and label entities manually can be very expensive in terms of time, money, and effort. This paper investigates the use of prompt-based language models (OpenAI’s GPT-3) and weak supervision in the legal domain. We apply both strategies as alternative approaches to the traditional human-based annotation method, relying on computer power instead of human effort for labeling, and subsequently compare model performance between computer and human-generated data. We also introduce combinations of all three mentioned methods (prompt-based, weak supervision, and human annotation), aiming to find ways to maintain high model efficiency and low annotation costs. We showed that, despite human labeling still maintaining better overall performance results, the alternative strategies and their combinations presented themselves as valid options, displaying positive results and similar model scores at lower costs. Final results demonstrate preservation of human-trained model scores averaging 74.0% for GPT-3, 95.6% for weak supervision, 90.7% for GPT + weak supervision combination, and 83.9% for GPT + 30% human-labeling combination.

Citations: 0

Agents preserving privacy on intelligent transportation systems according to EU law

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-02-12 DOI: 10.1007/s10506-024-09391-0

Javier Carbo, Juanita Pedraza, Jose M. Molina

Vol. 33, No. 2, pp. 437-470. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10506-024-09391-0.pdf

Abstract: Intelligent Transportation Systems are expected to automate how parking slots are booked by trucks. The intrinsic dynamic nature of this problem, the need for explanations and the inclusion of private data justify an agent-based solution. Agents solving this problem act with Belief-Desire-Intention (BDI) reasoning and are implemented with JASON. The privacy of trucks is protected by sharing a list of parkings ordered by preference. Furthermore, the process of assigning parking slots takes into account legal requirements on breaks and driving time limits. Finally, the agent simulations use distances and numbers of trucks and parkings that correspond to the proportions in current European Union data. The performance of the proposed solution is tested in these simulations with three different distances against an alternative with complete knowledge. The difference in efficiency, the number of illegal breaks and the traveled distances are measured in them. Comparing the results, we can conclude that the nonprivate alternative is slightly better in performance, while neither alternative produces illegal breaks. In this way the simulations show that the proposed privacy protection does not impose a relevant handicap in efficiency.

Citations: 0

Code is law: how COMPAS affects the way the judiciary handles the risk of recidivism

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-02-09 DOI: 10.1007/s10506-024-09389-8

Christoph Engel, Lorenz Linhardt, Marcel Schubert

Citations: 0

To test or not to test? A question of rational decision making in forensic biology

IF 3.1, Zone 2 (Sociology)

Artificial Intelligence and Law Pub Date : 2024-01-30 DOI: 10.1007/s10506-023-09386-3

Simone Gittelson, Franco Taroni

Vol. 33, No. 2, pp. 293-322.

Abstract: How can the forensic scientist rationally justify performing a sequence of tests and analyses in a particular case? When is it worth performing a test or analysis on an item? Currently, there is a large void in logical frameworks for making rational decisions in forensic science. The aim of this paper is to fill this void by presenting a step-by-step guide on how to apply Bayesian decision theory to routine decision problems encountered by forensic scientists on performing or not performing a particular laboratory test or analysis. A decision-theoretic framework, composed of actions, states of nature, and utilities, models this problem, and an influence diagram translates its notions into a probabilistic graphical network. Within this framework, the expected value of information (EVOI) for the submission of an item to a particular test or analysis addresses the above questions. The development of a classical case example on whether to perform presumptive tests for blood before submitting the item for a DNA analysis illustrates the use of this model for source level questions in forensic biology (i.e., questions that ask whether a crime stain consisting of a particular body fluid comes from a particular person). We show how to construct an influence diagram for this example, and how sensitivity analyses lead to an optimal analytical sequence. The key idea is to show that such a Bayesian decisional approach provides a coherent framework for justifying the optimal analytical sequence for a particular case in forensic science.
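The expected value of information (EVOI) named in the abstract of "To test or not to test?" can be made concrete with a small worked example. In the standard decision-theoretic definition, EVOI is the expected utility of acting optimally after seeing the test result, averaged over the possible results, minus the expected utility of acting optimally on the prior alone. The Python sketch below applies that definition to a two-state, two-action problem. Every number in it (prior, sensitivity, specificity, utilities) and every name (`evoi`, `best_expected_utility`, `UTILITIES`) is hypothetical and chosen only for illustration; none comes from the paper.

```python
from typing import Dict, Tuple

# utilities[action] = (utility if the stain is blood, utility if it is not)
Utilities = Dict[str, Tuple[float, float]]

# Hypothetical utilities: DNA analysis pays off only when the stain is blood.
UTILITIES: Utilities = {
    "submit_for_dna": (1.0, -0.5),
    "do_not_submit": (0.0, 0.0),
}


def best_expected_utility(p_blood: float, utilities: Utilities) -> float:
    """Expected utility of the best action given P(stain is blood)."""
    return max(p_blood * u_yes + (1 - p_blood) * u_no
               for u_yes, u_no in utilities.values())


def evoi(prior: float, sensitivity: float, specificity: float,
         utilities: Utilities) -> float:
    """Expected value of a presumptive test with the given error rates."""
    # Probability of each test outcome under the prior.
    p_pos = sensitivity * prior + (1 - specificity) * (1 - prior)
    p_neg = 1 - p_pos
    # Posterior P(blood | outcome) via Bayes' theorem.
    post_pos = sensitivity * prior / p_pos if p_pos > 0 else prior
    post_neg = (1 - sensitivity) * prior / p_neg if p_neg > 0 else prior
    with_test = (p_pos * best_expected_utility(post_pos, utilities)
                 + p_neg * best_expected_utility(post_neg, utilities))
    return with_test - best_expected_utility(prior, utilities)


value = evoi(prior=0.5, sensitivity=0.9, specificity=0.8, utilities=UTILITIES)
print(round(value, 2))  # 0.15
```

Under these assumed numbers the test is worth running whenever its cost, expressed on the same utility scale, is below 0.15. A test with sensitivity and specificity of 0.5 leaves the posterior equal to the prior, so its EVOI is zero.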

Citations: 0