Workshop on Biomedical Natural Language Processing: Latest Papers, Page 4

Quantifying 60 Years of Gender Bias in Biomedical Research with Word Embeddings

Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.1

Anthony Rios, Reenam Joshi, Hejin Shin

Citations: 14

Sequence-to-Set Semantic Tagging for Complex Query Reformulation and Automated Text Categorization in Biomedical IR using Self-Attention

Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.2

Manirupa Das, Juanxi Li, E. Fosler-Lussier, Simon M. Lin, S. Rust, Yungui Huang, R. Ramnath

Abstract: Novel contexts, comprising a set of terms referring to one or more concepts, often arise in complex querying scenarios such as evidence-based medicine (EBM) involving biomedical literature. These may not explicitly refer to entities or canonical concept forms occurring in a fact-based knowledge source, e.g. the UMLS ontology. Moreover, hidden associations between related concepts that are meaningful in the current context may not exist within a single document, but across documents in the collection. Predicting semantic concept tags of documents can therefore serve to associate documents related in unseen contexts, or to categorize them, in information filtering or retrieval scenarios. Thus, inspired by the success of sequence-to-sequence neural models, we develop a novel sequence-to-set framework with attention for learning document representations in a unique unsupervised setting. It uses no human-annotated document labels or external knowledge resources, only corpus-derived term statistics to drive the training, and can effect term transfer within a corpus for semantically tagging a large collection of documents. Our sequence-to-set approach to predicting semantic tags gives, to the best of our knowledge, state-of-the-art results on two fronts: an unsupervised query expansion (QE) task on the TREC CDS 2016 challenge dataset, evaluated with an Okapi BM25-based document retrieval system; and supervised and semi-supervised multi-label prediction tasks on the del.icio.us and Ohsumed datasets, where it improves over the MLTM system baseline (Soleimani and Miller, 2016). We make our code and data publicly available.

Citations: 4

Improving Biomedical Analogical Retrieval with Embedding of Structural Dependencies

Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.4

Amandalynne Paullada, B. Percha, T. Cohen

Abstract: Inferring the nature of the relationships between biomedical entities from text is an important problem due to the difficulty of maintaining human-curated knowledge bases in rapidly evolving fields. Neural word embeddings have earned attention for an apparent ability to encode relational information. However, word embedding models that disregard syntax during training are limited in their ability to encode the structural relationships fundamental to cognitive theories of analogy. In this paper, we demonstrate the utility of encoding dependency structure in word embeddings, in a model we call Embedding of Structural Dependencies (ESD), as a way to represent biomedical relationships in two analogical retrieval tasks: a relationship retrieval (RR) task, and a literature-based discovery (LBD) task meant to hypothesize plausible relationships between pairs of entities unseen in training. We compare our model to skip-gram with negative sampling (SGNS), using 19 databases of biomedical relationships as our evaluation data, with improvements in performance on 17 (LBD) and 18 (RR) of these sets. These results suggest that embeddings encoding dependency path information are of value for biomedical analogy retrieval.

Citations: 6

Neural Transduction of Letter Position Dyslexia using an Anagram Matrix Representation

Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.16

A. Bleiweiss

Citations: 0

A BERT-based One-Pass Multi-Task Model for Clinical Temporal Relation Extraction

Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.7

Chen Lin, Timothy A. Miller, Dmitriy Dligach, Farig Sadeque, Steven Bethard, G. Savova

Citations: 16

Towards Visual Dialog for Radiology

Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.6

Olga Kovaleva, Chaitanya P. Shivade, Satyananda Kashyap, Karina Kanjaria, Joy T. Wu, Deddeh Ballah, Adam Coy, A. Karargyris, Yufan Guo, D. Beymer, Anna Rumshisky, Vandana V. Mukherjee

Citations: 13

Extensive Error Analysis and a Learning-Based Evaluation of Medical Entity Recognition Systems to Approximate User Experience

Workshop on Biomedical Natural Language Processing Pub Date : 2020-06-01 DOI: 10.18653/v1/2020.bionlp-1.19

I. Nejadgholi, Kathleen C. Fraser, Berry de Bruijn

Abstract: When comparing entities extracted by a medical entity recognition system with gold standard annotations over a test set, two types of mismatch can occur: label mismatch and span mismatch. Here we focus on span mismatch and show that its severity can vary from a serious error to a fully acceptable entity extraction, due to the subjectivity of span annotations. For a domain-specific BERT-based NER system, we show that 25% of the errors have the same labels as, and overlapping spans with, gold standard entities. We collected expert judgement, which shows that more than 90% of these mismatches are accepted or partially accepted by the user. Using the training set of the NER system, we built a fast and lightweight entity classifier to approximate the user experience of such mismatches by accepting or rejecting them. The decisions made by this classifier are used to calculate a learning-based F-score, which is shown to be a better approximation of a forgiving user’s experience than the relaxed F-score. We demonstrate the results of applying the proposed evaluation metric to a variety of deep learning medical entity recognition models trained with two datasets.
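
The contrast between exact-span and overlapping-span matching in the abstract above can be illustrated with a minimal sketch. It rests on several assumptions and is not the authors' implementation: the `Entity` type, the `f_score` function, the token offsets, and the "Problem"/"Treatment" labels are all hypothetical names chosen for illustration. The relaxed variant here accepts any same-label overlap; the learning-based F-score described in the abstract would instead defer that accept/reject decision to a trained classifier, which is not reproduced here.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    start: int  # inclusive token offset (illustrative)
    end: int    # exclusive token offset (illustrative)
    label: str


def overlaps(a: Entity, b: Entity) -> bool:
    """True when the two half-open spans share at least one token."""
    return a.start < b.end and b.start < a.end


def f_score(gold: list[Entity], pred: list[Entity], relaxed: bool = False) -> float:
    """Entity-level F1. Strict mode needs identical spans and labels;
    relaxed mode needs identical labels and any span overlap."""

    def match(p: Entity, g: Entity) -> bool:
        if p.label != g.label:
            return False  # label mismatch is never accepted in this sketch
        if relaxed:
            return overlaps(p, g)
        return (p.start, p.end) == (g.start, g.end)

    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(match(p, g) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the first prediction drops one leading token of the gold span.
gold = [Entity(0, 3, "Problem"), Entity(5, 6, "Treatment")]
pred = [Entity(1, 3, "Problem"), Entity(5, 6, "Treatment")]

strict = f_score(gold, pred)                 # span mismatch counts as an error
relaxed = f_score(gold, pred, relaxed=True)  # same-label overlap counts as a hit
print(strict, relaxed)
```

In this toy case the strict score penalizes a prediction that a forgiving user might well accept, while the relaxed score accepts every same-label overlap regardless of how much of the span is missing. That gap is the motivation the abstract gives for a classifier-based middle ground.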

Citations: 10

Entity-Enriched Neural Models for Clinical Question Answering

Workshop on Biomedical Natural Language Processing Pub Date : 2020-05-13 DOI: 10.18653/v1/2020.bionlp-1.12

Bhanu Pratap Singh Rawat, W. Weng, So Yeon Min, Preethi Raghavan, Peter Szolovits

Citations: 16

An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining

Workshop on Biomedical Natural Language Processing Pub Date : 2020-05-06 DOI: 10.18653/v1/2020.bionlp-1.22

Yifan Peng, Qingyu Chen, Zhiyong Lu

Citations: 77

Personalized Early Stage Alzheimer’s Disease Detection: A Case Study of President Reagan’s Speeches

Workshop on Biomedical Natural Language Processing Pub Date : 2020-05-06 DOI: 10.1101/2020.05.01.20087627

N. Wang, F. Luo, V. Peddagangireddy, K.P. Subbalakshmi, R. Chandramouli

Abstract: Alzheimer’s disease (AD)-related global healthcare cost is estimated to be $1 trillion by 2050. Currently, there is no cure for this disease; however, clinical studies show that early diagnosis and intervention help to extend the quality of life and inform technologies for personalized mental healthcare. Clinical research indicates that the onset and progression of Alzheimer’s disease lead to dementia and other mental health issues. As a result, the language capabilities of patients start to decline. In this paper, we show that machine learning-based unsupervised clustering of, and anomaly detection with, linguistic biomarkers are promising approaches for intuitive visualization and personalized early stage detection of Alzheimer’s disease. We demonstrate this approach on a data set of 10 years (1980 to 1989) of President Ronald Reagan’s speeches. Key linguistic biomarkers that indicate early-stage AD are identified. Experimental results show that Reagan had early onset of Alzheimer’s sometime between 1983 and 1987. This finding is corroborated by prior work that analyzed his interviews using a statistical technique. The proposed technique also identifies the exact speeches that reflect linguistic biomarkers for early stage AD.

Citations: 3