{"title":"金发姑娘的可重复性研究:为 TAR 对 BERT 进行恰到好处的调整","authors":"Xinyu Mao, B. Koopman, G. Zuccon","doi":"10.48550/arXiv.2401.08104","DOIUrl":null,"url":null,"abstract":"Screening documents is a tedious and time-consuming aspect of high-recall retrieval tasks, such as compiling a systematic literature review, where the goal is to identify all relevant documents for a topic. To help streamline this process, many Technology-Assisted Review (TAR) methods leverage active learning techniques to reduce the number of documents requiring review. BERT-based models have shown high effectiveness in text classification, leading to interest in their potential use in TAR workflows. In this paper, we investigate recent work that examined the impact of further pre-training epochs on the effectiveness and efficiency of a BERT-based active learning pipeline. We first report that we could replicate the original experiments on two specific TAR datasets, confirming some of the findings: importantly, that further pre-training is critical to high effectiveness, but requires attention in terms of selecting the correct training epoch. We then investigate the generalisability of the pipeline on a different TAR task, that of medical systematic reviews. In this context, we show that there is no need for further pre-training if a domain-specific BERT backbone is used within the active learning pipeline. This finding provides practical implications for using the studied active learning pipeline within domain-specific TAR tasks.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"78 1","pages":"132-146"},"PeriodicalIF":0.0000,"publicationDate":"2024-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Reproducibility Study of Goldilocks: Just-Right Tuning of BERT for TAR\",\"authors\":\"Xinyu Mao, B. Koopman, G. Zuccon\",\"doi\":\"10.48550/arXiv.2401.08104\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Screening documents is a tedious and time-consuming aspect of high-recall retrieval tasks, such as compiling a systematic literature review, where the goal is to identify all relevant documents for a topic. To help streamline this process, many Technology-Assisted Review (TAR) methods leverage active learning techniques to reduce the number of documents requiring review. BERT-based models have shown high effectiveness in text classification, leading to interest in their potential use in TAR workflows. In this paper, we investigate recent work that examined the impact of further pre-training epochs on the effectiveness and efficiency of a BERT-based active learning pipeline. We first report that we could replicate the original experiments on two specific TAR datasets, confirming some of the findings: importantly, that further pre-training is critical to high effectiveness, but requires attention in terms of selecting the correct training epoch. We then investigate the generalisability of the pipeline on a different TAR task, that of medical systematic reviews. In this context, we show that there is no need for further pre-training if a domain-specific BERT backbone is used within the active learning pipeline. 
This finding provides practical implications for using the studied active learning pipeline within domain-specific TAR tasks.\",\"PeriodicalId\":126309,\"journal\":{\"name\":\"European Conference on Information Retrieval\",\"volume\":\"78 1\",\"pages\":\"132-146\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-01-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"European Conference on Information Retrieval\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2401.08104\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Conference on Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2401.08104","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Screening documents is a tedious and time-consuming aspect of high-recall retrieval tasks, such as compiling a systematic literature review, where the goal is to identify all relevant documents for a topic. To help streamline this process, many Technology-Assisted Review (TAR) methods leverage active learning techniques to reduce the number of documents requiring review. BERT-based models have shown high effectiveness in text classification, leading to interest in their potential use in TAR workflows. In this paper, we investigate recent work that examined the impact of further pre-training epochs on the effectiveness and efficiency of a BERT-based active learning pipeline. We first report that we could replicate the original experiments on two specific TAR datasets, confirming some of the findings: importantly, that further pre-training is critical to high effectiveness, but that the training epoch must be selected with care. We then investigate the generalisability of the pipeline on a different TAR task, that of medical systematic reviews. In this context, we show that there is no need for further pre-training if a domain-specific BERT backbone is used within the active learning pipeline. This finding provides practical implications for using the studied active learning pipeline within domain-specific TAR tasks.
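To make the workflow concrete, below is a minimal sketch of the screening loop that TAR pipelines of this kind implement: train a classifier on the documents reviewed so far, score the unreviewed pool, send the highest-scoring documents to a human reviewer, and repeat. This is an illustration under stated assumptions, not the authors' code: the paper fine-tunes a (further pre-trained or domain-specific) BERT classifier at each round, while this sketch substitutes a TF-IDF plus logistic regression model and a hypothetical toy corpus so the example is self-contained and runnable.

    # Sketch of a TAR-style active learning loop (assumption: this mirrors the
    # generic workflow described in the abstract, not the authors' exact code).
    # The paper fine-tunes a BERT classifier each round; a TF-IDF + logistic
    # regression stand-in is used here so the example runs without model downloads.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Toy corpus of (document text, true relevance label) -- hypothetical data.
    docs = [
        ("randomised trial of drug A for hypertension", 1),
        ("cohort study of drug A adverse events", 1),
        ("recipe blog post about baking bread", 0),
        ("football match report from last weekend", 0),
        ("systematic review protocol for drug A", 1),
        ("celebrity gossip column", 0),
    ]
    texts = [t for t, _ in docs]
    labels = np.array([y for _, y in docs])

    vec = TfidfVectorizer()
    X = vec.fit_transform(texts)

    labelled = {0, 2}   # seed set: one relevant, one non-relevant document
    batch_size = 1      # documents sent for human review per round

    while len(labelled) < len(docs):
        pool = [i for i in range(len(docs)) if i not in labelled]
        idx = sorted(labelled)
        # Retrain on everything reviewed so far (BERT fine-tuning in the paper).
        clf = LogisticRegression().fit(X[idx], labels[idx])
        # Relevance sampling: review the pool documents the model scores as
        # most likely relevant, the greedy selection strategy typical of TAR.
        scores = clf.predict_proba(X[pool])[:, 1]
        chosen = [pool[i] for i in np.argsort(scores)[::-1][:batch_size]]
        labelled.update(chosen)  # in practice a human reviewer supplies labels
        print(f"reviewed so far: {sorted(labelled)}")

The loop stops here once the whole pool is reviewed; real TAR deployments instead apply a stopping criterion once further review is unlikely to surface relevant documents, which is where reducing the number of documents requiring review saves effort.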