Stopping Methods for Technology Assisted Reviews based on Point Processes

IF 5.4 2区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Mark Stevenson, Reem Bin-Hezam
{"title":"Stopping Methods for Technology Assisted Reviews based on Point Processes","authors":"Mark Stevenson, Reem Bin-Hezam","doi":"10.1145/3631990","DOIUrl":null,"url":null,"abstract":"Technology Assisted Review (TAR), which aims to reduce the effort required to screen collections of documents for relevance, is used to develop systematic reviews of medical evidence and identify documents that must be disclosed in response to legal proceedings. Stopping methods are algorithms which determine when to stop screening documents during the TAR process, helping to ensure that workload is minimised while still achieving a high level of recall. This paper proposes a novel stopping method based on point processes, which are statistical models that can be used to represent the occurrence of random events. The approach uses rate functions to model the occurrence of relevant documents in the ranking and compares four candidates, including one that has not previously been used for this purpose (hyperbolic). Evaluation is carried out using standard datasets (CLEF e-Health, TREC Total Recall, TREC Legal), and this work is the first to explore stopping method robustness by reporting performance on a range of rankings of varying effectiveness. Results show that the proposed method achieves the desired level of recall without requiring an excessive number of documents to be examined in the majority of cases and also compares well against multiple alternative approaches.","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":null,"pages":null},"PeriodicalIF":5.4000,"publicationDate":"2023-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Information Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3631990","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Technology Assisted Review (TAR), which aims to reduce the effort required to screen collections of documents for relevance, is used to develop systematic reviews of medical evidence and identify documents that must be disclosed in response to legal proceedings. Stopping methods are algorithms which determine when to stop screening documents during the TAR process, helping to ensure that workload is minimised while still achieving a high level of recall. This paper proposes a novel stopping method based on point processes, which are statistical models that can be used to represent the occurrence of random events. The approach uses rate functions to model the occurrence of relevant documents in the ranking and compares four candidates, including one that has not previously been used for this purpose (hyperbolic). Evaluation is carried out using standard datasets (CLEF e-Health, TREC Total Recall, TREC Legal), and this work is the first to explore stopping method robustness by reporting performance on a range of rankings of varying effectiveness. Results show that the proposed method achieves the desired level of recall without requiring an excessive number of documents to be examined in the majority of cases and also compares well against multiple alternative approaches.
基于点过程的技术辅助评审停止方法
技术辅助审查(TAR)的目的是减少对收集的文件进行相关性筛选所需的努力,用于对医学证据进行系统审查,并确定为应对法律诉讼而必须披露的文件。停止方法是确定在TAR过程中何时停止筛选文件的算法,有助于确保在实现高水平召回的同时最大限度地减少工作量。本文提出了一种新的基于点过程的停止方法,点过程是一种可以用来表示随机事件发生的统计模型。该方法使用比率函数对相关文档在排名中的出现情况进行建模,并比较四个候选文档,包括一个以前未用于此目的的文档(双曲线)。使用标准数据集(CLEF e-Health, TREC Total Recall, TREC Legal)进行评估,这项工作是第一个通过在一系列不同有效性排名上报告性能来探索停止方法稳健性的工作。结果表明,在大多数情况下,所提出的方法达到了所需的召回水平,而不需要检查过多的文档,并且与多种替代方法相比也比较好。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
ACM Transactions on Information Systems
ACM Transactions on Information Systems 工程技术-计算机:信息系统
CiteScore
9.40
自引率
14.30%
发文量
165
审稿时长
>12 weeks
期刊介绍: The ACM Transactions on Information Systems (TOIS) publishes papers on information retrieval (such as search engines, recommender systems) that contain: new principled information retrieval models or algorithms with sound empirical validation; observational, experimental and/or theoretical studies yielding new insights into information retrieval or information seeking; accounts of applications of existing information retrieval techniques that shed light on the strengths and weaknesses of the techniques; formalization of new information retrieval or information seeking tasks and of methods for evaluating the performance on those tasks; development of content (text, image, speech, video, etc) analysis methods to support information retrieval and information seeking; development of computational models of user information preferences and interaction behaviors; creation and analysis of evaluation methodologies for information retrieval and information seeking; or surveys of existing work that propose a significant synthesis. The information retrieval scope of ACM Transactions on Information Systems (TOIS) appeals to industry practitioners for its wealth of creative ideas, and to academic researchers for its descriptions of their colleagues'' work.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信