End-to-end deep framework for disease named entity recognition using social media data

2017 IEEE 30th Neumann Colloquium (NC) Pub Date: 2017-11-01 DOI: 10.1109/NC.2017.8263281

Z. Miftahutdinov, E. Tutubalina

Citations: 4

Abstract

Interest in natural language processing methods applied to healthcare has grown in recent years. In particular, new pharmacological properties of drugs can be derived from patient observations shared in social media forums. Developing approaches that automatically retrieve this information is of considerable interest for personalized medicine and wide-scale drug tests. The full potential of exploiting both textual data and published biological data for drug research often goes untapped, mostly because of the lack of tools and focused methodologies to curate and integrate the data and transform it into new, experimentally testable hypotheses. Deep learning architectures have shown promising results for a wide range of tasks. In this work, we propose to address a challenging problem by applying modern deep neural networks to disease named entity recognition. An essential step for this task is the recognition of disease mentions and medical concept normalization, which is highly difficult with simple string matching approaches. We cast the task as an end-to-end problem, solved using two architectures based on recurrent neural networks and pre-trained word embeddings. We show that it is possible to assess the practicability of using social media data to extract representative medical concepts for pharmacovigilance or drug repurposing.
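
The abstract notes that recognizing and normalizing disease mentions is hard with simple string matching. The following is a minimal sketch of such a baseline, not the authors' method: the lexicon, the concept IDs, and the two example sentences are all hypothetical and chosen only to illustrate how exact lookup can miss the lay phrasing typical of social media posts.

```python
import re

# Hypothetical toy lexicon: surface form -> made-up concept ID.
# None of these entries or IDs come from the paper.
LEXICON = {
    "headache": "CONCEPT_001",
    "insomnia": "CONCEPT_002",
    "nausea": "CONCEPT_003",
}


def match_concepts(text, lexicon=LEXICON):
    """Return (surface form, concept ID) pairs found by exact token lookup."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [(tok, lexicon[tok]) for tok in tokens if tok in lexicon]


# Clinical-style wording uses the lexicon terms directly.
formal = "Patient reports headache and insomnia."
# Illustrative forum-style wording describes the same complaints informally.
informal = "my head is pounding and i cant sleep at all"

formal_hits = match_concepts(formal)
informal_hits = match_concepts(informal)
print(formal_hits)
print(informal_hits)
```

In this toy setting the exact-match baseline finds both concepts in the formal sentence and none in the informal one, which is the kind of gap that motivates learned sequence models over dictionary lookup.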
