检索-修订-再完善:检索简明内涵法律文章集的新框架

IF 7.4 1区 管理学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Chau Nguyen, Phuong Nguyen, Le-Minh Nguyen
{"title":"检索-修订-再完善:检索简明内涵法律文章集的新框架","authors":"Chau Nguyen,&nbsp;Phuong Nguyen,&nbsp;Le-Minh Nguyen","doi":"10.1016/j.ipm.2024.103949","DOIUrl":null,"url":null,"abstract":"<div><div>The retrieval of entailing legal article sets aims to identify a concise set of legal articles that holds an entailment relationship with a legal query or its negation. Unlike traditional information retrieval that focuses on relevance ranking, this task demands conciseness. However, prior research has inadequately addressed this need by employing traditional methods. To bridge this gap, we propose a three-stage Retrieve–Revise–Refine framework which explicitly addresses the need for conciseness by utilizing both small and large language models (LMs) in distinct yet complementary roles. Empirical evaluations on the COLIEE 2022 and 2023 datasets demonstrate that our framework significantly enhances performance, achieving absolute increases in the macro F2 score by 3.17% and 4.24% over previous state-of-the-art methods, respectively. Specifically, our Retrieve stage, employing various tailored fine-tuning strategies for small LMs, achieved a recall rate exceeding 0.90 in the top-5 results alone—ensuring comprehensive coverage of entailing articles. In the subsequent Revise stage, large LMs narrow this set, improving precision while sacrificing minimal coverage. The Refine stage further enhances precision by leveraging specialized insights from small LMs, resulting in a relative improvement of up to 19.15% in the number of concise article sets retrieved compared to previous methods. Our framework offers a promising direction for further research on specialized methods for retrieving concise sets of entailing legal articles, thereby more effectively meeting the task’s demands.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103949"},"PeriodicalIF":7.4000,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Retrieve–Revise–Refine: A novel framework for retrieval of concise entailing legal article set\",\"authors\":\"Chau Nguyen,&nbsp;Phuong Nguyen,&nbsp;Le-Minh Nguyen\",\"doi\":\"10.1016/j.ipm.2024.103949\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The retrieval of entailing legal article sets aims to identify a concise set of legal articles that holds an entailment relationship with a legal query or its negation. Unlike traditional information retrieval that focuses on relevance ranking, this task demands conciseness. However, prior research has inadequately addressed this need by employing traditional methods. To bridge this gap, we propose a three-stage Retrieve–Revise–Refine framework which explicitly addresses the need for conciseness by utilizing both small and large language models (LMs) in distinct yet complementary roles. Empirical evaluations on the COLIEE 2022 and 2023 datasets demonstrate that our framework significantly enhances performance, achieving absolute increases in the macro F2 score by 3.17% and 4.24% over previous state-of-the-art methods, respectively. Specifically, our Retrieve stage, employing various tailored fine-tuning strategies for small LMs, achieved a recall rate exceeding 0.90 in the top-5 results alone—ensuring comprehensive coverage of entailing articles. In the subsequent Revise stage, large LMs narrow this set, improving precision while sacrificing minimal coverage. The Refine stage further enhances precision by leveraging specialized insights from small LMs, resulting in a relative improvement of up to 19.15% in the number of concise article sets retrieved compared to previous methods. Our framework offers a promising direction for further research on specialized methods for retrieving concise sets of entailing legal articles, thereby more effectively meeting the task’s demands.</div></div>\",\"PeriodicalId\":50365,\"journal\":{\"name\":\"Information Processing & Management\",\"volume\":\"62 1\",\"pages\":\"Article 103949\"},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2024-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Processing & Management\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S030645732400308X\",\"RegionNum\":1,\"RegionCategory\":\"管理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S030645732400308X","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

摘要

蕴涵法律条文集检索的目的是找出与法律查询或其否定具有蕴涵关系的简明法律条文集。与注重相关性排序的传统信息检索不同,这项任务要求信息简洁。然而,之前的研究采用传统方法未能充分满足这一需求。为了弥补这一不足,我们提出了一个 "检索-修订-再完善 "的三阶段框架,该框架通过利用小型和大型语言模型(LMs)发挥不同但互补的作用,明确地满足了对简洁性的需求。在 COLIEE 2022 和 2023 数据集上进行的实证评估表明,我们的框架显著提高了性能,宏 F2 分数的绝对值比以前的先进方法分别提高了 3.17% 和 4.24%。具体来说,我们的 "检索"(Retrieve)阶段针对小型 LM 采用了各种量身定制的微调策略,仅在前五名结果中的召回率就超过了 0.90,确保了对相关文章的全面覆盖。在随后的 "修订 "阶段,大型 LMs 缩小了这一范围,在提高精确度的同时牺牲了最小的覆盖范围。精炼阶段利用小型 LM 的专业见解进一步提高了精确度,与以前的方法相比,检索到的简明文章集数量相对提高了 19.15%。我们的框架为进一步研究检索包含法律条文的简明文章集的专门方法提供了一个很有前景的方向,从而更有效地满足任务的需求。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Retrieve–Revise–Refine: A novel framework for retrieval of concise entailing legal article set
The retrieval of entailing legal article sets aims to identify a concise set of legal articles that holds an entailment relationship with a legal query or its negation. Unlike traditional information retrieval that focuses on relevance ranking, this task demands conciseness. However, prior research has inadequately addressed this need by employing traditional methods. To bridge this gap, we propose a three-stage Retrieve–Revise–Refine framework which explicitly addresses the need for conciseness by utilizing both small and large language models (LMs) in distinct yet complementary roles. Empirical evaluations on the COLIEE 2022 and 2023 datasets demonstrate that our framework significantly enhances performance, achieving absolute increases in the macro F2 score by 3.17% and 4.24% over previous state-of-the-art methods, respectively. Specifically, our Retrieve stage, employing various tailored fine-tuning strategies for small LMs, achieved a recall rate exceeding 0.90 in the top-5 results alone—ensuring comprehensive coverage of entailing articles. In the subsequent Revise stage, large LMs narrow this set, improving precision while sacrificing minimal coverage. The Refine stage further enhances precision by leveraging specialized insights from small LMs, resulting in a relative improvement of up to 19.15% in the number of concise article sets retrieved compared to previous methods. Our framework offers a promising direction for further research on specialized methods for retrieving concise sets of entailing legal articles, thereby more effectively meeting the task’s demands.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Information Processing & Management
Information Processing & Management 工程技术-计算机:信息系统
CiteScore
17.00
自引率
11.60%
发文量
276
审稿时长
39 days
期刊介绍: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信