Hugo Mentzingen, Nuno António, Fernando Bacao, Marcio Cunha
International Journal of Information Management Data Insights, Volume 4, Issue 2, Article 100247. Published 2024-05-15. DOI: 10.1016/j.jjimei.2024.100247
Textual similarity for legal precedents discovery: Assessing the performance of machine learning techniques in an administrative court
The importance of legal precedents in ensuring consistent jurisprudence is undisputed. Particularly in jurisdictions following common law, but even in civil law systems, uniformity in case law requires adherence to precedents. However, with the growing volume of cases, manual identification of precedents becomes a bottleneck, prompting the need for automation. Leveraging natural language processing (NLP) and machine learning (ML), our study examines the potential of automation in identifying similar cases indicative of precedents. Drawing on a unique, substantial dataset of legal cases from an administrative court in Brazil, we evaluated over one hundred combinations of document representations and text vectorizations. In contrast to earlier studies that relied on minimal validation samples, ours employed a statistically significant sample vetted by legal experts. Our findings reveal that models focusing on granular text representations perform best, especially when extracting concepts and relations. Notably, while more intricate models do not always guarantee superior outcomes, the importance of refining textual features cannot be overstated. These findings pave the way for creating efficient decision support systems in judicial contexts and set a direction for future research aiming to integrate technology into legal decision-making.
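To make the idea of pairing a document representation with a text vectorization concrete, the following is a minimal sketch of one such combination: whitespace tokens weighted by TF-IDF and compared by cosine similarity. The abstract does not say which representations or vectorizers the study used, so this choice is an assumption made for illustration only. The function names and the three toy case summaries are hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real legal-text preprocessing)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps weights positive even for terms found in every document.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_index, docs, top_k=2):
    """Rank the other documents by similarity to docs[query_index], best first."""
    vectors = tfidf_vectors(docs)
    scores = [
        (i, cosine(vectors[query_index], vec))
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical case summaries, invented for this example.
CASES = [
    "tax credit denied for export of manufactured goods",
    "export tax credit for manufactured goods was denied on appeal",
    "penalty for late filing of income declaration",
]

if __name__ == "__main__":
    for index, score in most_similar(0, CASES):
        print(index, round(score, 3))
```

Swapping the tokenizer for a different unit of text (for example, extracted concepts instead of raw words) or the weighting for a different vectorizer yields another combination of the kind the study compares, while the ranking step stays the same.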