Joint model of biomedical entity recognition and normalization labels based on self-attention
Authors: Dandan Zhou, Tong Liu
Published: 2022-12-08, JITeCS Journal of Information Technology and Computer Science
DOI: https://doi.org/10.1117/12.2653583
To address error propagation between biomedical named entity recognition and normalization, a joint label is designed that combines entity labels with concept labels, so that each term in the sentence receives a single label and the joint learning task becomes a multiclass classification problem. On this basis, a joint model of biomedical entity recognition and normalization labels based on self-attention is proposed. The pre-trained model BioBERT encodes the medical text; joint label information is extracted with a self-attention mechanism and fused with the input sequence information; the final joint label representation is then obtained by softmax. Experimental results show that the F1 values for the entity recognition and normalization tasks reach 83.3% and 84.5% on the NCBI dataset and 84.2% and 86.6% on the BC5CDR dataset, which are better than those of the existing methods compared.
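The joint labelling idea can be illustrated with a minimal sketch. The abstract does not specify the tagging scheme or how the two labels are combined, so the BIO tags, the `|` separator, the helper names, and the concept ID `C001` below are all assumptions made for illustration only, not the authors' implementation.

```python
from typing import Dict, List, Tuple

SEP = "|"      # assumed separator; the abstract does not specify one
OUTSIDE = "O"  # assumed BIO-style tag for terms outside any entity


def make_joint_labels(entity_tags: List[str], concept_ids: List[str]) -> List[str]:
    """Combine per-term entity tags and concept IDs into joint labels."""
    if len(entity_tags) != len(concept_ids):
        raise ValueError("entity_tags and concept_ids must have the same length")
    joint = []
    for tag, cid in zip(entity_tags, concept_ids):
        # Terms outside any entity carry no concept and keep the plain "O" label.
        joint.append(OUTSIDE if tag == OUTSIDE else f"{tag}{SEP}{cid}")
    return joint


def split_joint_label(label: str) -> Tuple[str, str]:
    """Recover (entity_tag, concept_id); concept_id is empty for "O"."""
    if label == OUTSIDE:
        return OUTSIDE, ""
    tag, cid = label.split(SEP, 1)
    return tag, cid


def build_label_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Map each distinct joint label to a class index for multiclass classification."""
    labels = sorted({label for seq in sequences for label in seq})
    return {label: i for i, label in enumerate(labels)}


# Hypothetical example sentence; "C001" is a placeholder, not a real concept ID.
tokens = ["colorectal", "cancer", "is", "common"]
entity_tags = ["B-Disease", "I-Disease", "O", "O"]
concept_ids = ["C001", "C001", "", ""]

joint = make_joint_labels(entity_tags, concept_ids)
vocab = build_label_vocab([joint])
```

Under these assumptions each term carries exactly one class, so a single softmax over the joint label vocabulary predicts the entity boundary and the concept together, and `split_joint_label` recovers the two task outputs for separate evaluation.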