Title: How Deep is Your Law? Predicting Associations Between Cases in Philippine Jurisprudence
Authors: M. A. Martija, J. Domoguen, P. Naval
DOI: 10.1109/TENCON.2019.8929425
Journal: Platonic Investigations, vol. 23, no. 1, pp. 886-891
Publication date: 2019-10-01
Citations: 1
Abstract
We explore the problem of finding court rulings that are semantically relevant to a current legal case by proposing convolutional neural network (CNN), long short-term memory (LSTM), and Doc2Vec models for multi-class, multi-label text classification. Specifically, the problem we aim to solve is: given a case and its statement of facts, which other cases are closely related to it and may be used as resources to build arguments for the given case? The dataset was scraped and extracted from the LawPhil Project of the Arellano Law Foundation. To extract suitable features for the models, inputs are first cleaned, tokenized, and transformed into dense vector representations. To this end, a Word2Vec model is first trained on a corpus of case law from 1987 to 2013 for feature extraction. We used two configurations for the output space: predicting cases drawn from pools of the top 500 and top 100 most cited cases. Our results show that it is challenging to obtain a good balance between precision and recall: some model configurations achieve good recall at the expense of precision, and vice versa. In addition to the CNN and LSTM, we also explored a Doc2Vec model trained to find the cases that are most semantically similar to a particular case. A sensitivity analysis examines how the number of similar cases recommended by the Doc2Vec model affects recall. Finally, we also explore a problem somewhat simpler than our primary objective in this paper: predicting the laws associated with a given case. Since this problem is less complex, we expect, and indeed show, that our models perform relatively better on it.
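The preprocessing step the abstract describes (clean, tokenize, map to dense vectors) can be sketched as follows. This is a minimal illustration, not the authors' code: the paper trains Word2Vec on 1987-2013 case law, whereas the tiny 3-dimensional vectors below are invented stand-ins, and the averaging of word vectors into a document vector is one common, assumed way to obtain a fixed-size input for a classifier.

```python
import re

# Hypothetical word vectors; in the paper these would come from a Word2Vec
# model trained on the LawPhil corpus of case law.
WORD_VECTORS = {
    "court":    [0.2, 0.1, 0.7],
    "petition": [0.5, 0.3, 0.1],
    "contract": [0.1, 0.8, 0.2],
}

def tokenize(text):
    """Clean and tokenize: lowercase, keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())

def doc_vector(text):
    """Average the vectors of known tokens into one dense document vector."""
    vecs = [WORD_VECTORS[t] for t in tokenize(text) if t in WORD_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

print(doc_vector("The Court dismissed the petition."))
```

A vector built this way could then be fed to the CNN or LSTM classifier; the paper itself does not specify this exact pooling, so treat it as one plausible baseline.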
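The Doc2Vec retrieval step, where the k most semantically similar cases are recommended for a query case, can be sketched as a cosine-similarity ranking. The case names and vectors below are invented for illustration; in the paper the vectors would be inferred by a trained Doc2Vec model, and varying k is the knob examined in the sensitivity analysis (larger k tends to raise recall at the cost of precision).

```python
import math

# Hypothetical document vectors for three cases (stand-ins for Doc2Vec output).
CASE_VECTORS = {
    "People v. A": [0.9, 0.1, 0.0],
    "People v. B": [0.1, 0.9, 0.0],
    "People v. C": [0.8, 0.2, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_similar(query_vec, k):
    """Return the names of the k cases most similar to the query vector."""
    ranked = sorted(CASE_VECTORS,
                    key=lambda name: cosine(query_vec, CASE_VECTORS[name]),
                    reverse=True)
    return ranked[:k]

print(top_k_similar([1.0, 0.0, 0.0], k=2))
```

Sweeping k in `top_k_similar` and measuring recall against known citations is, in spirit, the sensitivity analysis the abstract mentions.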