Mingyuan Gao, Hao Wei, Fei Chen, Wenqiang Qu, Mingyu Lu
2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS), published 2019-10-01. DOI: 10.1109/ICSESS47205.2019.9040749
HDCNN-CRF for Biomedical Text Named Entity Recognition
Biomedical named entity recognition (BNER) is one of the most basic and important tasks in biomedical text mining. LSTM-based models process tokens sequentially and so cannot take full advantage of parallelism, which makes recognition slower. This paper focuses on improving the model structure and proposes HDCNN-CRF, a method that combines a hybrid dilated convolutional neural network (HDCNN) with a conditional random field (CRF). The method avoids the expensive cost of manual feature construction and is also much faster than LSTM-based methods for named entity recognition (NER). We use Adam for optimization during model training and the IOBES tagging scheme to label the sequence. The HDCNN-CRF model, which does not rely on any costly feature engineering, performs well on the NCBI-disease corpus. Owing to its high degree of parallelism, its speed is four times that of BLSTM.
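The IOBES tagging scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `spans_to_iobes`, the `(start, end, label)` span format, the `Disease` label, and the example sentence are all assumptions made for illustration. In IOBES, a single-token entity is tagged `S-`, a multi-token entity starts with `B-`, continues with `I-`, and ends with `E-`, and every other token is tagged `O`.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end_exclusive, entity_type)
Span = Tuple[int, int, str]


def spans_to_iobes(num_tokens: int, spans: List[Span]) -> List[str]:
    """Turn non-overlapping entity spans into one IOBES tag per token."""
    tags = ["O"] * num_tokens  # tokens outside any entity stay "O"
    for start, end, label in spans:
        if not 0 <= start < end <= num_tokens:
            raise ValueError(f"span out of range: {(start, end, label)}")
        if end - start == 1:
            # Single-token entity
            tags[start] = f"S-{label}"
        else:
            # Multi-token entity: Begin, Inside..., End
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"E-{label}"
    return tags


# Made-up example sentence, not taken from the NCBI-disease corpus.
tokens = ["The", "patient", "has", "cystic", "fibrosis", "and", "asthma", "."]
spans = [(3, 5, "Disease"), (6, 7, "Disease")]

for token, tag in zip(tokens, spans_to_iobes(len(tokens), spans)):
    print(f"{token}\t{tag}")
```

Compared with plain BIO tags, the explicit `E-` and `S-` tags mark entity boundaries on both sides, which gives a CRF layer more specific transitions to learn between adjacent labels.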