Using Bottleneck Adapters to Identify Cancer in Clinical Notes under Low-Resource Constraints

Workshop on Biomedical Natural Language Processing | Pub Date: 2022-10-17 | DOI: 10.48550/arXiv.2210.09440

Omid Rohanian, Hannah Jauncey, Mohammadmahdi Nouriborji, Bronner P. Gonçalves, C. Kartsonaki, ISARIC Clinical Characterisation Group, L. Merson, D. Clifton

Citations: 2

Abstract

Processing information locked within clinical health records is a challenging task that remains an active area of research in biomedical NLP. In this work, we evaluate a broad set of machine learning techniques ranging from simple RNNs to specialised transformers such as BioBERT on a dataset containing clinical notes along with a set of annotations indicating whether a sample is cancer-related or not. Furthermore, we specifically employ efficient fine-tuning methods from NLP, namely, bottleneck adapters and prompt tuning, to adapt the models to our specialised task. Our evaluations suggest that fine-tuning a frozen BERT model pre-trained on natural language and with bottleneck adapters outperforms all other strategies, including full fine-tuning of the specialised BioBERT model. Based on our findings, we suggest that using bottleneck adapters in low-resource situations with limited access to labelled data or processing capacity could be a viable strategy in biomedical text mining.
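The abstract does not describe how a bottleneck adapter is built, so the following is only a minimal, dependency-free sketch of the general idea: a small down-projection, a non-linearity, an up-projection, and a residual connection, inserted into an otherwise frozen model. The class name `BottleneckAdapter`, the ReLU activation, the zero-initialised up-projection, and the bottleneck size of 64 are illustrative assumptions, not details taken from the paper; 768 is the hidden size of BERT-base.

```python
import random


def matvec(weights, vec):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


class BottleneckAdapter:
    """Hypothetical adapter: down-project, ReLU, up-project, add the input back."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        # Down-projection (hidden -> bottleneck) with small random weights.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                       for _ in range(bottleneck_size)]
        self.b_down = [0.0] * bottleneck_size
        # Assumption: the up-projection starts at zero, so the adapter is an
        # identity map before training and does not disturb the frozen model.
        self.w_up = [[0.0] * bottleneck_size for _ in range(hidden_size)]
        self.b_up = [0.0] * hidden_size

    def forward(self, hidden):
        down = [max(0.0, z + b)
                for z, b in zip(matvec(self.w_down, hidden), self.b_down)]
        up = [z + b for z, b in zip(matvec(self.w_up, down), self.b_up)]
        # Residual connection: the adapter only adds a learned correction.
        return [h + u for h, u in zip(hidden, up)]

    def num_parameters(self):
        d, r = len(self.w_up), len(self.w_down)
        # Two weight matrices (d*r each) plus the two bias vectors.
        return 2 * d * r + d + r


adapter = BottleneckAdapter(hidden_size=768, bottleneck_size=64)
hidden_state = [0.1] * 768
assert adapter.forward(hidden_state) == hidden_state  # identity at initialisation
print(adapter.num_parameters())  # 2*768*64 + 768 + 64 = 99136
```

Under these assumed sizes, one adapter adds about 99 thousand trainable parameters while the pre-trained weights stay frozen, which is the sense in which the approach suits settings with limited labelled data or processing capacity.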
