Sezen Perçin, Andrea Galassi, F. Lagioia, Federico Ruggeri, Piera Santin, G. Sartor, Paolo Torroni
{"title":"Combining WordNet and Word Embeddings in Data Augmentation for Legal Texts","authors":"Sezen Perçin, Andrea Galassi, F. Lagioia, Federico Ruggeri, Piera Santin, G. Sartor, Paolo Torroni","doi":"10.18653/v1/2022.nllp-1.4","DOIUrl":"https://doi.org/10.18653/v1/2022.nllp-1.4","url":null,"abstract":"Creating balanced labeled textual corpora for complex tasks, like legal analysis, is a challenging and expensive process that often requires the collaboration of domain experts. To address this problem, we propose a data augmentation method based on the combination of GloVe word embeddings and the WordNet ontology. We present an example of application in the legal domain, specifically on decisions of the Court of Justice of the European Union. Our evaluation with human experts confirms that our method is more robust than the alternatives.","PeriodicalId":278495,"journal":{"name":"Proceedings of the Natural Legal Language Processing Workshop 2022","volume":"27 15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131106385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Cross-Domain Transferability of Text Generation Models for Legal Text","authors":"Vinayshekhar Bannihatti Kumar, Kasturi Bhattacharjee","doi":"10.18653/v1/2022.nllp-1.9","DOIUrl":"https://doi.org/10.18653/v1/2022.nllp-1.9","url":null,"abstract":"Legalese can often be filled with verbose domain-specific jargon which can make it challenging to understand and use for non-experts. Creating succinct summaries of legal documents often makes it easier for user comprehension. However, obtaining labeled data for every domain of legal text is challenging, which makes cross-domain transferability of text generation models for legal text an important area of research. In this paper, we explore the ability of existing state-of-the-art T5 & BART-based summarization models to transfer across legal domains. We leverage publicly available datasets across four domains for this task, one of which is a new resource for summarizing privacy policies, that we curate and release for academic research. Our experiments demonstrate the low cross-domain transferability of these models, while also highlighting the benefits of combining different domains. Further, we compare the effectiveness of standard metrics for this task and illustrate the vast differences in their performance.","PeriodicalId":278495,"journal":{"name":"Proceedings of the Natural Legal Language Processing Workshop 2022","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130372621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiang Li, Jiaxun Gao, D. Inkpen, Wolfgang Alschner
{"title":"Detecting Relevant Differences Between Similar Legal Texts","authors":"Xiang Li, Jiaxun Gao, D. Inkpen, Wolfgang Alschner","doi":"10.18653/v1/2022.nllp-1.24","DOIUrl":"https://doi.org/10.18653/v1/2022.nllp-1.24","url":null,"abstract":"Given two similar legal texts, it is useful to be able to focus only on the parts that contain relevant differences. However, because of variation in linguistic structure and terminology, it is not easy to identify true semantic differences. An accurate difference detection model between similar legal texts is therefore in demand, in order to increase the efficiency of legal research and document analysis. In this paper, we automatically label a training dataset of sentence pairs using an existing legal resource of international investment treaties that were already manually annotated with metadata. Then we propose models based on state-of-the-art deep learning techniques for the novel task of detecting relevant differences. In addition to providing solutions for this task, we include models for automatically producing metadata for the treaties that do not have it.","PeriodicalId":278495,"journal":{"name":"Proceedings of the Natural Legal Language Processing Workshop 2022","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115863943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reto Gubelmann, Peter Hongler, Elina Margadant, S. Handschuh
{"title":"On What it Means to Pay Your Fair Share: Towards Automatically Mapping Different Conceptions of Tax Justice in Legal Research Literature","authors":"Reto Gubelmann, Peter Hongler, Elina Margadant, S. Handschuh","doi":"10.18653/v1/2022.nllp-1.2","DOIUrl":"https://doi.org/10.18653/v1/2022.nllp-1.2","url":null,"abstract":"In this article, we explore the potential and challenges of applying transformer-based pre-trained language models (PLMs) and statistical methods to a particularly challenging, yet highly important and largely uncharted domain: normative discussions in tax law research. On our conviction, the role of NLP in this essentially contested territory is to make explicit implicit normative assumptions, and to foster debates across ideological divides. To this goal, we propose the first steps towards a method that automatically labels normative statements in tax law research, and that suggests the normative background of these statements. Our results are encouraging, but it is clear that there is still room for improvement.","PeriodicalId":278495,"journal":{"name":"Proceedings of the Natural Legal Language Processing Workshop 2022","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116860748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}