Authors: Fei Deng, Dongdong Zhang, Jing Peng
Venue: 2021 13th International Conference on Machine Learning and Computing
Published: 2021-02-26
DOI: 10.1145/3457682.3457751 (https://doi.org/10.1145/3457682.3457751)
Biological Named Entity Recognition and Role Labeling via Deep Multi-task Learning
Bioscience is an experimental science. The qualitative and quantitative findings of biological experiments are often available only as figures in published papers. In this paper, we introduce the SourceData model, which captures a key aspect of biological experimental design by assigning each biological entity involved in an experiment to one of six roles. Our work aims to determine automatically, using natural language processing algorithms, whether a given entity is subjected to a perturbation or is the object of a measurement (entity role labeling). We use state-of-the-art transformer models (e.g., BERT and its variants) as a strong baseline and find that, when the model is jointly trained with a biological named entity recognition (NER) task through deep multi-task learning (MTL), the F1 score for role labeling improves by 2% over the previous single-task architecture. On the NER task itself, the MTL method achieves comparable performance on five public datasets. Further analysis shows the importance of fusing entity information at the input layer of the entity role labeling task and of incorporating global context.
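The abstract does not say how the two task objectives are combined during joint training. As an illustration only, the sketch below assumes a common multi-task setup: each task has its own per-token classification output, and the training objective is a weighted sum of the two mean token-level cross-entropy losses. The function names, the `role_weight` parameter, and the toy label counts are hypothetical and are not taken from the paper.

```python
import math


def softmax_cross_entropy(logits, gold):
    """Cross-entropy for one token: -log softmax(logits)[gold]."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[gold]


def mean_token_loss(token_logits, gold_labels):
    """Average cross-entropy over the tokens of one sentence."""
    losses = [softmax_cross_entropy(l, g)
              for l, g in zip(token_logits, gold_labels)]
    return sum(losses) / len(losses)


def joint_loss(ner_logits, ner_gold, role_logits, role_gold, role_weight=1.0):
    """Assumed MTL objective: NER loss + role_weight * role-labeling loss.

    Both sets of logits would come from task-specific heads on top of a
    shared encoder; here they are plain lists so the sketch stays
    self-contained.
    """
    ner = mean_token_loss(ner_logits, ner_gold)
    role = mean_token_loss(role_logits, role_gold)
    return ner + role_weight * role


# Toy sentence with two tokens, 3 hypothetical NER labels, 2 hypothetical roles.
ner_logits = [[2.0, 0.1, -1.0], [0.0, 0.5, 1.5]]
role_logits = [[1.2, -0.3], [-0.8, 0.9]]
loss = joint_loss(ner_logits, [0, 2], role_logits, [0, 1], role_weight=0.5)
print(f"joint loss: {loss:.4f}")
```

With `role_weight=0.0` the objective reduces to single-task NER training, which makes the weight a convenient knob for comparing single-task and multi-task settings in a sketch like this one.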