{"title":"检索-修订-再完善:检索简明内涵法律文章集的新框架","authors":"Chau Nguyen, Phuong Nguyen, Le-Minh Nguyen","doi":"10.1016/j.ipm.2024.103949","DOIUrl":null,"url":null,"abstract":"<div><div>The retrieval of entailing legal article sets aims to identify a concise set of legal articles that holds an entailment relationship with a legal query or its negation. Unlike traditional information retrieval that focuses on relevance ranking, this task demands conciseness. However, prior research has inadequately addressed this need by employing traditional methods. To bridge this gap, we propose a three-stage Retrieve–Revise–Refine framework which explicitly addresses the need for conciseness by utilizing both small and large language models (LMs) in distinct yet complementary roles. Empirical evaluations on the COLIEE 2022 and 2023 datasets demonstrate that our framework significantly enhances performance, achieving absolute increases in the macro F2 score by 3.17% and 4.24% over previous state-of-the-art methods, respectively. Specifically, our Retrieve stage, employing various tailored fine-tuning strategies for small LMs, achieved a recall rate exceeding 0.90 in the top-5 results alone—ensuring comprehensive coverage of entailing articles. In the subsequent Revise stage, large LMs narrow this set, improving precision while sacrificing minimal coverage. The Refine stage further enhances precision by leveraging specialized insights from small LMs, resulting in a relative improvement of up to 19.15% in the number of concise article sets retrieved compared to previous methods. Our framework offers a promising direction for further research on specialized methods for retrieving concise sets of entailing legal articles, thereby more effectively meeting the task’s demands.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103949"},"PeriodicalIF":7.4000,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Retrieve–Revise–Refine: A novel framework for retrieval of concise entailing legal article set\",\"authors\":\"Chau Nguyen, Phuong Nguyen, Le-Minh Nguyen\",\"doi\":\"10.1016/j.ipm.2024.103949\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The retrieval of entailing legal article sets aims to identify a concise set of legal articles that holds an entailment relationship with a legal query or its negation. Unlike traditional information retrieval that focuses on relevance ranking, this task demands conciseness. However, prior research has inadequately addressed this need by employing traditional methods. To bridge this gap, we propose a three-stage Retrieve–Revise–Refine framework which explicitly addresses the need for conciseness by utilizing both small and large language models (LMs) in distinct yet complementary roles. Empirical evaluations on the COLIEE 2022 and 2023 datasets demonstrate that our framework significantly enhances performance, achieving absolute increases in the macro F2 score by 3.17% and 4.24% over previous state-of-the-art methods, respectively. Specifically, our Retrieve stage, employing various tailored fine-tuning strategies for small LMs, achieved a recall rate exceeding 0.90 in the top-5 results alone—ensuring comprehensive coverage of entailing articles. In the subsequent Revise stage, large LMs narrow this set, improving precision while sacrificing minimal coverage. The Refine stage further enhances precision by leveraging specialized insights from small LMs, resulting in a relative improvement of up to 19.15% in the number of concise article sets retrieved compared to previous methods. Our framework offers a promising direction for further research on specialized methods for retrieving concise sets of entailing legal articles, thereby more effectively meeting the task’s demands.</div></div>\",\"PeriodicalId\":50365,\"journal\":{\"name\":\"Information Processing & Management\",\"volume\":\"62 1\",\"pages\":\"Article 103949\"},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2024-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Processing & Management\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S030645732400308X\",\"RegionNum\":1,\"RegionCategory\":\"管理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S030645732400308X","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Retrieve–Revise–Refine: A novel framework for retrieval of concise entailing legal article set
The retrieval of entailing legal article sets aims to identify a concise set of legal articles that holds an entailment relationship with a legal query or its negation. Unlike traditional information retrieval that focuses on relevance ranking, this task demands conciseness. However, prior research has inadequately addressed this need by employing traditional methods. To bridge this gap, we propose a three-stage Retrieve–Revise–Refine framework which explicitly addresses the need for conciseness by utilizing both small and large language models (LMs) in distinct yet complementary roles. Empirical evaluations on the COLIEE 2022 and 2023 datasets demonstrate that our framework significantly enhances performance, achieving absolute increases in the macro F2 score by 3.17% and 4.24% over previous state-of-the-art methods, respectively. Specifically, our Retrieve stage, employing various tailored fine-tuning strategies for small LMs, achieved a recall rate exceeding 0.90 in the top-5 results alone—ensuring comprehensive coverage of entailing articles. In the subsequent Revise stage, large LMs narrow this set, improving precision while sacrificing minimal coverage. The Refine stage further enhances precision by leveraging specialized insights from small LMs, resulting in a relative improvement of up to 19.15% in the number of concise article sets retrieved compared to previous methods. Our framework offers a promising direction for further research on specialized methods for retrieving concise sets of entailing legal articles, thereby more effectively meeting the task’s demands.
期刊介绍:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.