Title: How Deep is Your Law? Predicting Associations Between Cases in Philippine Jurisprudence
Authors: M. A. Martija, J. Domoguen, P. Naval
DOI: 10.1109/TENCON.2019.8929425
Journal: Platonic Investigations, vol. 23, no. 1, pp. 886-891
Publication date: 2019-10-01
Citations: 1
Abstract
We explore the problem of finding court rulings that are semantically relevant to a current legal case by proposing convolutional neural network (CNN), long short-term memory (LSTM), and Doc2Vec models for multi-class, multi-label text classification. Specifically, the problem we aim to solve is: given a case and its statement of facts, which other cases are closely related to it and may be used as resources to build arguments for the given case? The dataset was scraped and extracted from the LawPhil Project of the Arellano Law Foundation. To extract suitable features for the models, inputs are first cleaned, tokenized, and transformed into dense vector representations. To this end, a Word2Vec model is first trained on a corpus of case law from 1987 to 2013 for feature extraction. We used two configurations for the output space: predicting cases drawn from pools of the top 500 and top 100 most cited cases. Our results show that it is challenging to obtain a good balance between precision and recall: some model configurations achieve good recall at the expense of precision, and vice versa. In addition to the CNN and LSTM, we also explored a Doc2Vec model trained to find the cases that are most semantically similar to a particular case. A sensitivity analysis examines how the number of similar cases recommended by the Doc2Vec model affects recall. Finally, we also explore a problem somewhat simpler than our primary objective in this paper: predicting the laws associated with a given case. Since this problem is less complex, we expect, and indeed show, that our models perform relatively better on it.
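The preprocessing step the abstract describes (clean, tokenize, map to dense vectors) can be sketched as follows. This is a minimal illustration, not the authors' code: the paper trains Word2Vec on 1987-2013 case law, whereas the tiny 3-dimensional vectors below are invented stand-ins, and the averaging of word vectors into a document vector is one common, assumed way to obtain a fixed-size input for a classifier.

```python
import re

# Hypothetical word vectors; in the paper these would come from a Word2Vec
# model trained on the LawPhil corpus of case law.
WORD_VECTORS = {
    "court":    [0.2, 0.1, 0.7],
    "petition": [0.5, 0.3, 0.1],
    "contract": [0.1, 0.8, 0.2],
}

def tokenize(text):
    """Clean and tokenize: lowercase, keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())

def doc_vector(text):
    """Average the vectors of known tokens into one dense document vector."""
    vecs = [WORD_VECTORS[t] for t in tokenize(text) if t in WORD_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

print(doc_vector("The Court dismissed the petition."))
```

A vector built this way could then be fed to the CNN or LSTM classifier; the paper itself does not specify this exact pooling, so treat it as one plausible baseline.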
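The Doc2Vec retrieval step, where the k most semantically similar cases are recommended for a query case, can be sketched as a cosine-similarity ranking. The case names and vectors below are invented for illustration; in the paper the vectors would be inferred by a trained Doc2Vec model, and varying k is the knob examined in the sensitivity analysis (larger k tends to raise recall at the cost of precision).

```python
import math

# Hypothetical document vectors for three cases (stand-ins for Doc2Vec output).
CASE_VECTORS = {
    "People v. A": [0.9, 0.1, 0.0],
    "People v. B": [0.1, 0.9, 0.0],
    "People v. C": [0.8, 0.2, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_similar(query_vec, k):
    """Return the names of the k cases most similar to the query vector."""
    ranked = sorted(CASE_VECTORS,
                    key=lambda name: cosine(query_vec, CASE_VECTORS[name]),
                    reverse=True)
    return ranked[:k]

print(top_k_similar([1.0, 0.0, 0.0], k=2))
```

Sweeping k in `top_k_similar` and measuring recall against known citations is, in spirit, the sensitivity analysis the abstract mentions.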