A Simple Data Augmentation Method to Improve the Performance of Named Entity Recognition Models in Medical Domain

2021 6th International Conference on Computer Science and Engineering (UBMK)
Pub Date: 2021-09-15
DOI: 10.1109/UBMK52708.2021.9558986

Abdul Majeed Issifu, M. Ganiz

Citations: 7

Abstract

Easy Data Augmentation was originally developed for text classification tasks. It consists of four basic methods: Synonym Replacement, Random Insertion, Random Deletion, and Random Swap. These methods yield accuracy improvements on several deep neural network models. In this study, we apply them to a new domain by augmenting Named Entity Recognition datasets from the medical domain. Augmentation is much more difficult in this setting because named entities consist of single words or word groups within a sentence, but we show that these methods can still improve named entity recognition performance.
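
The abstract does not say how entity spans are kept intact during augmentation. As an illustration only, the following minimal sketch applies the four Easy Data Augmentation operations to a token sequence with BIO labels, assuming a policy that only touches tokens tagged `O`. The tag scheme, the toy `SYNONYMS` table, and all function names are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy synonym table; a real setup might use WordNet or a medical thesaurus.
SYNONYMS = {"severe": ["intense", "acute"], "reported": ["described", "noted"]}


def o_positions(labels):
    """Indices of tokens that lie outside every entity (BIO tag 'O')."""
    return [i for i, lab in enumerate(labels) if lab == "O"]


def synonym_candidates(tokens, labels):
    """Indices of 'O' tokens that have an entry in the synonym table."""
    return [i for i in o_positions(labels) if tokens[i].lower() in SYNONYMS]


def synonym_replacement(tokens, labels, n, rng):
    """Replace up to n non-entity tokens with a synonym."""
    tokens = list(tokens)
    candidates = synonym_candidates(tokens, labels)
    for i in rng.sample(candidates, min(n, len(candidates))):
        tokens[i] = rng.choice(SYNONYMS[tokens[i].lower()])
    return tokens, list(labels)


def random_insertion(tokens, labels, n, rng):
    """Insert up to n synonyms of non-entity tokens, never inside an entity span."""
    tokens, labels = list(tokens), list(labels)
    for _ in range(n):
        candidates = synonym_candidates(tokens, labels)
        if not candidates:
            break
        word = rng.choice(SYNONYMS[tokens[rng.choice(candidates)].lower()])
        # Inserting directly before an I- token would split an entity, so skip those slots.
        slots = [p for p in range(len(tokens) + 1)
                 if p == len(tokens) or not labels[p].startswith("I-")]
        pos = rng.choice(slots)
        tokens.insert(pos, word)
        labels.insert(pos, "O")
    return tokens, labels


def random_deletion(tokens, labels, p, rng):
    """Drop each non-entity token with probability p; entity tokens are always kept."""
    kept = [(t, l) for t, l in zip(tokens, labels) if l != "O" or rng.random() > p]
    if not kept and tokens:  # never return an empty sentence
        i = rng.randrange(len(tokens))
        kept = [(tokens[i], labels[i])]
    return [t for t, _ in kept], [l for _, l in kept]


def random_swap(tokens, labels, n, rng):
    """Swap two non-entity tokens, n times; labels stay valid because both are 'O'."""
    tokens = list(tokens)
    positions = o_positions(labels)
    if len(positions) >= 2:
        for _ in range(n):
            i, j = rng.sample(positions, 2)
            tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens, list(labels)


def entity_spans(tokens, labels):
    """Collect (entity_type, text) pairs from a BIO-labelled sequence."""
    spans, current = [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            current = [lab[2:], [tok]]
            spans.append(current)
        elif lab.startswith("I-") and current is not None:
            current[1].append(tok)
        else:
            current = None
    return [(etype, " ".join(words)) for etype, words in spans]


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = ["The", "patient", "reported", "severe", "chest", "pain",
              "after", "taking", "aspirin"]
    labels = ["O", "O", "O", "O", "B-Symptom", "I-Symptom", "O", "O", "B-Drug"]
    for op, arg in [(synonym_replacement, 1), (random_insertion, 1),
                    (random_deletion, 0.2), (random_swap, 1)]:
        new_tokens, new_labels = op(tokens, labels, arg, rng)
        print(op.__name__, list(zip(new_tokens, new_labels)))
```

Restricting edits to `O` tokens is only one possible design choice: it guarantees that every entity mention and its label survive unchanged, at the cost of never varying the entities themselves.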

