Using deep neural network to recognize mutation entities in biomedical literature

Fan Tong, Zheheng Luo, Dongsheng Zhao
Published in: 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018-12-01
DOI: 10.1109/BIBM.2018.8621134 (https://doi.org/10.1109/BIBM.2018.8621134)
Citations: 5

Abstract

Automatically recognizing mutation mentions plays a fundamental and critical role in extracting variant-disease relations from biomedical literature. In this paper, we propose a model for mutation mention detection that combines a deep neural network with a decoding algorithm and regular expressions. Inspired by distributed representations of words and characters, we split each word into tokens at the boundaries between letters of different case, numbers, and special characters, and use these tokens to train a token embedding that can capture some nomenclature features of mutations. The network uses bidirectional LSTM (long short-term memory) layers to learn a general form of mutation mentions while capturing long-term context information, and fully connected layers to improve its fitting capability; its input is the concatenation of word vectors trained from the token embeddings. The Viterbi algorithm decodes the network output to obtain an initial label sequence. On top of that, regular expression patterns are used to label mutation mentions, which provides extra information for refining the initial output. Trained and tested on the NCBI tmVar mutation corpus, our model achieves an F-score of 91.59%, outperforming currently reported systems.
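The abstract names three non-neural steps: splitting words into tokens, Viterbi decoding, and regular-expression labeling. The sketch below illustrates each one in plain Python. It is not the authors' implementation: the splitting rule, the example pattern, the `MUT`/`O` label names, and the way regex matches are merged with the decoded labels are all assumptions made for illustration, and the BiLSTM itself is left out (its per-word scores are passed in as `emissions`).

```python
import re
from typing import List, Sequence

# Assumed splitting rule: runs of uppercase letters, runs of lowercase
# letters, runs of digits, and single special characters become tokens.
_TOKEN_RE = re.compile(r"[A-Z]+|[a-z]+|[0-9]+|[^A-Za-z0-9\s]")

# Hypothetical pattern covering two simple mention shapes only,
# e.g. "c.123A>G" and "V600E". A real system would need many more.
_MUTATION_RE = re.compile(r"^(?:[cg]\.\d+[ACGT]>[ACGT]|[A-Z]\d+[A-Z])$")


def split_tokens(word: str) -> List[str]:
    """Split one word into case/digit/special-character tokens."""
    return _TOKEN_RE.findall(word)


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label index sequence.

    emissions[t][j]   : log-score of label j at position t
    transitions[i][j] : log-score of moving from label i to label j
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backptr = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_labels):
            # Best previous label for arriving at label j.
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transitions[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final label.
    best = max(range(n_labels), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(backptr):
        best = ptr[best]
        path.append(best)
    path.reverse()
    return path


def refine_with_regex(words: Sequence[str],
                      labels: Sequence[str]) -> List[str]:
    """Assumed merge rule: a regex match upgrades an 'O' label to 'MUT'."""
    return ["MUT" if lab == "O" and _MUTATION_RE.match(w) else lab
            for w, lab in zip(words, labels)]
```

For instance, `split_tokens("c.123A>G")` separates the prefix, position, and bases into distinct tokens, which is the kind of nomenclature structure the token embedding is meant to pick up.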