Workshop on Biomedical Natural Language Processing: Latest Publications

Quantifying 60 Years of Gender Bias in Biomedical Research with Word Embeddings
Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.1
Anthony Rios, Reenam Joshi, Hejin Shin
Abstract: Gender bias in biomedical research can have an adverse impact on the health of real people. For example, there is evidence that heart disease-related funded research generally focuses on men. Health disparities can form between men and at-risk groups of women (i.e., elderly and low-income) if there is not an equal number of heart disease-related studies for both genders. In this paper, we study temporal bias in biomedical research articles by measuring gender differences in word embeddings. Specifically, we address multiple questions, including: how has gender bias changed over time in biomedical research, and what health-related concepts are the most biased? Overall, we find that traditional gender stereotypes have reduced over time. However, we also find that the embeddings of many medical conditions are as biased today as they were 60 years ago (e.g., concepts related to drug addiction and body dysmorphia).
Citations: 14
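The abstract above describes measuring gender differences in word embeddings without giving the exact metric. As a minimal sketch only, assuming a simple association score (mean cosine similarity to male attribute vectors minus mean cosine similarity to female attribute vectors), the idea can be illustrated as follows. The function names and the toy 2-dimensional vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_bias_score(word_vec, male_vecs, female_vecs):
    """Assumed bias measure: positive means closer to the male attribute
    vectors, negative means closer to the female ones, zero means balanced."""
    male_sim = sum(cosine(word_vec, m) for m in male_vecs) / len(male_vecs)
    female_sim = sum(cosine(word_vec, f) for f in female_vecs) / len(female_vecs)
    return male_sim - female_sim


# Toy, hand-made vectors (hypothetical; real embeddings would be trained
# separately on each time slice of the corpus).
male_vecs = [[1.0, 0.0]]
female_vecs = [[0.0, 1.0]]
neutral_concept = [1.0, 1.0]
skewed_concept = [3.0, 1.0]

print(gender_bias_score(neutral_concept, male_vecs, female_vecs))
print(gender_bias_score(skewed_concept, male_vecs, female_vecs))
```

Computing such a score for the same concept in embeddings trained on different decades would give one possible view of how its bias changes over time.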
Sequence-to-Set Semantic Tagging for Complex Query Reformulation and Automated Text Categorization in Biomedical IR using Self-Attention
Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.2
Manirupa Das, Juanxi Li, E. Fosler-Lussier, Simon M. Lin, S. Rust, Yungui Huang, R. Ramnath
Abstract: Novel contexts, comprising a set of terms referring to one or more concepts, may often arise in complex querying scenarios such as evidence-based medicine (EBM) involving biomedical literature. These may not explicitly refer to entities or canonical concept forms occurring in a fact-based knowledge source, e.g., the UMLS ontology. Moreover, hidden associations between related concepts that are meaningful in the current context may not exist within a single document, but across documents in the collection. Predicting semantic concept tags of documents can therefore serve to associate documents related in unseen contexts, or categorize them, in information filtering or retrieval scenarios. Thus, inspired by the success of sequence-to-sequence neural models, we develop a novel sequence-to-set framework with attention for learning document representations in a unique unsupervised setting, using no human-annotated document labels or external knowledge resources and only corpus-derived term statistics to drive the training, that can effect term transfer within a corpus for semantically tagging a large collection of documents. Our sequence-to-set modeling approach to predicting semantic tags gives, to the best of our knowledge, the state of the art both for an unsupervised query expansion (QE) task on the TREC CDS 2016 challenge dataset, when evaluated on an Okapi BM25-based document retrieval system, and over the MLTM system baseline (Soleimani and Miller, 2016) for both supervised and semi-supervised multi-label prediction tasks on the del.icio.us and Ohsumed datasets. We make our code and data publicly available.
Citations: 4
Improving Biomedical Analogical Retrieval with Embedding of Structural Dependencies
Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.4
Amandalynne Paullada, B. Percha, T. Cohen
Abstract: Inferring the nature of the relationships between biomedical entities from text is an important problem due to the difficulty of maintaining human-curated knowledge bases in rapidly evolving fields. Neural word embeddings have earned attention for an apparent ability to encode relational information. However, word embedding models that disregard syntax during training are limited in their ability to encode the structural relationships fundamental to cognitive theories of analogy. In this paper, we demonstrate the utility of encoding dependency structure in word embeddings in a model we call Embedding of Structural Dependencies (ESD) as a way to represent biomedical relationships in two analogical retrieval tasks: a relationship retrieval (RR) task, and a literature-based discovery (LBD) task meant to hypothesize plausible relationships between pairs of entities unseen in training. We compare our model to skip-gram with negative sampling (SGNS), using 19 databases of biomedical relationships as our evaluation data, with improvements in performance on 17 (LBD) and 18 (RR) of these sets. These results suggest embeddings encoding dependency path information are of value for biomedical analogy retrieval.
Citations: 6
Neural Transduction of Letter Position Dyslexia using an Anagram Matrix Representation
Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.16
A. Bleiweiss
Abstract: Research on analyzing reading patterns of dyslexic children has mainly been driven by classifying dyslexia types offline. We contend that a framework to remedy reading errors inline is more far-reaching and will help to further advance our understanding of this impairment. In this paper, we propose a simple and intuitive neural model to reinstate migrating words that transpire in letter position dyslexia, a visual analysis deficit in the encoding of character order within a word. Introduced by the anagram matrix representation of an input verse, the novelty of our work lies in the expansion from a one- to a two-dimensional context window for training. This warrants that words which differ only in the disposition of letters remain interpreted as semantically similar in the embedding space. Subject to the apparent constraints of the self-attention transformer architecture, our model achieved a unigram BLEU score of 40.6 on our reconstructed dataset of the Shakespeare sonnets.
Citations: 0
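The abstract above reports a unigram BLEU score. As a rough illustration of what that metric measures, the sketch below computes sentence-level unigram BLEU (clipped unigram precision multiplied by the brevity penalty) for a single reference. Whitespace tokenization, the function name, and the example strings are assumptions made here and do not reflect the paper's evaluation setup.

```python
import math
from collections import Counter


def unigram_bleu(reference, hypothesis):
    """Sentence-level BLEU-1 against one reference, in the range [0, 1]."""
    ref_tokens = reference.split()
    hyp_tokens = hypothesis.split()
    if not hyp_tokens:
        return 0.0

    ref_counts = Counter(ref_tokens)
    hyp_counts = Counter(hyp_tokens)
    # Clip each hypothesis token count by its count in the reference.
    clipped = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    precision = clipped / len(hyp_tokens)

    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hyp_tokens) >= len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref_tokens) / len(hyp_tokens))
    return brevity_penalty * precision


# Hypothetical example: one word has two letters transposed.
reference = "shall i compare thee to a summers day"
hypothesis = "shall i comprae thee to a summers day"
print(unigram_bleu(reference, hypothesis))
```

Scores are often reported on a 0 to 100 scale, which corresponds to multiplying the value returned here by 100.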
A BERT-based One-Pass Multi-Task Model for Clinical Temporal Relation Extraction
Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.7
Chen Lin, Timothy A. Miller, Dmitriy Dligach, Farig Sadeque, Steven Bethard, G. Savova
Abstract: Recently, BERT has achieved state-of-the-art performance in temporal relation extraction from clinical Electronic Medical Records text. However, the current approach is inefficient as it requires multiple passes through each input sequence. We extend a recently proposed one-pass model for relation classification to a one-pass model for relation extraction. We augment this framework by introducing global embeddings to help with long-distance relation inference, and by multi-task learning to increase model performance and generalizability. Our proposed model produces results on par with the state of the art in temporal relation extraction on the THYME corpus and is much “greener” in computational cost.
Citations: 16
Towards Visual Dialog for Radiology
Workshop on Biomedical Natural Language Processing Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.6
Olga Kovaleva, Chaitanya P. Shivade, Satyananda Kashyap, Karina Kanjaria, Joy T. Wu, Deddeh Ballah, Adam Coy, A. Karargyris, Yufan Guo, D. Beymer, Anna Rumshisky, Vandana V. Mukherjee
Abstract: Current research in machine learning for radiology is focused mostly on images. There exists limited work investigating intelligent interactive systems for radiology. To address this limitation, we introduce a realistic and information-rich task of Visual Dialog in radiology, specific to chest X-ray images. Using MIMIC-CXR, an openly available database of chest X-ray images, we construct both a synthetic and a real-world dataset and provide baseline scores achieved by state-of-the-art models. We show that incorporating the medical history of the patient leads to better performance in answering questions than a conventional visual question answering model, which looks only at the image. While our experiments show promising results, they indicate that the task is extremely challenging with significant scope for improvement. We make both the datasets (synthetic and gold standard) and the associated code publicly available to the research community.
Citations: 13
Extensive Error Analysis and a Learning-Based Evaluation of Medical Entity Recognition Systems to Approximate User Experience
Workshop on Biomedical Natural Language Processing Pub Date : 2020-06-01 DOI: 10.18653/v1/2020.bionlp-1.19
I. Nejadgholi, Kathleen C. Fraser, Berry de Bruijn
Abstract: When comparing entities extracted by a medical entity recognition system with gold standard annotations over a test set, two types of mismatches might occur: label mismatch or span mismatch. Here we focus on span mismatch and show that its severity can vary from a serious error to a fully acceptable entity extraction due to the subjectivity of span annotations. For a domain-specific BERT-based NER system, we showed that 25% of the errors have the same labels and overlapping spans with gold standard entities. We collected expert judgement, which shows that more than 90% of these mismatches are accepted or partially accepted by the user. Using the training set of the NER system, we built a fast and lightweight entity classifier to approximate the user experience of such mismatches through accepting or rejecting them. The decisions made by this classifier are used to calculate a learning-based F-score, which is shown to be a better approximation of a forgiving user’s experience than the relaxed F-score. We demonstrated the results of applying the proposed evaluation metric for a variety of deep learning medical entity recognition models trained with two datasets.
Citations: 10
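To make the distinction between exact and relaxed span matching in the abstract above concrete, here is a minimal sketch that scores the same predictions under both criteria. It assumes entities are `(start, end, label)` tuples with an exclusive end offset, and that a relaxed match means the same label with any span overlap. The function names and the toy data are hypothetical, and the paper's learning-based F-score is not implemented here.

```python
def strict_match(gold_entity, pred_entity):
    """Exact match: identical span boundaries and identical label."""
    return gold_entity == pred_entity


def relaxed_match(gold_entity, pred_entity):
    """Relaxed match: same label and overlapping spans (end is exclusive)."""
    g_start, g_end, g_label = gold_entity
    p_start, p_end, p_label = pred_entity
    return g_label == p_label and g_start < p_end and p_start < g_end


def f_score(gold, pred, match):
    """F1 where an entity counts as correct if `match` pairs it with
    at least one entity on the other side."""
    matched_pred = sum(1 for p in pred if any(match(g, p) for g in gold))
    matched_gold = sum(1 for g in gold if any(match(g, p) for p in pred))
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy annotations (hypothetical): the second prediction has the right label
# but starts two characters late.
gold = [(0, 10, "Drug"), (20, 30, "Disease")]
pred = [(0, 10, "Drug"), (22, 30, "Disease")]

print(f_score(gold, pred, strict_match))   # span mismatch counted as an error
print(f_score(gold, pred, relaxed_match))  # span mismatch forgiven
```

Under this framing, a learning-based variant would pass a `match` function backed by a classifier's accept or reject decision instead of a fixed overlap rule.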
Entity-Enriched Neural Models for Clinical Question Answering
Workshop on Biomedical Natural Language Processing Pub Date : 2020-05-13 DOI: 10.18653/v1/2020.bionlp-1.12
Bhanu Pratap Singh Rawat, W. Weng, So Yeon Min, Preethi Raghavan, Peter Szolovits
Abstract: We explore state-of-the-art neural models for question answering on electronic medical records and improve their ability to generalize to previously unseen (paraphrased) questions at test time. We enable this by learning to predict logical forms as an auxiliary task along with the main task of answer span detection. The predicted logical forms also serve as a rationale for the answer. Further, we incorporate medical entity information in these models via the ERNIE architecture. We train our models on the large-scale emrQA dataset and observe that our multi-task entity-enriched models generalize to paraphrased questions ~5% better than the baseline BERT model.
Citations: 16
An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining
Workshop on Biomedical Natural Language Processing Pub Date : 2020-05-06 DOI: 10.18653/v1/2020.bionlp-1.22
Yifan Peng, Qingyu Chen, Zhiyong Lu
Abstract: Multi-task learning (MTL) has achieved remarkable success in natural language processing applications. In this work, we study a multi-task learning model with multiple decoders on a variety of biomedical and clinical natural language processing tasks such as text similarity, relation extraction, named entity recognition, and text inference. Our empirical results demonstrate that the MTL fine-tuned models outperform state-of-the-art transformer models (e.g., BERT and its variants) by 2.0% and 1.3% in biomedical and clinical domain adaptation, respectively. Pairwise MTL further demonstrates more details about which tasks can improve or decrease others. This is particularly helpful when researchers face the difficulty of choosing a suitable model for new problems. The code and models are publicly available at https://github.com/ncbi-nlp/bluebert.
Citations: 77
Personalized Early Stage Alzheimer’s Disease Detection: A Case Study of President Reagan’s Speeches
Workshop on Biomedical Natural Language Processing Pub Date : 2020-05-06 DOI: 10.1101/2020.05.01.20087627
N. Wang, F. Luo, V. Peddagangireddy, K.P. Subbalakshmi, R. Chandramouli
Abstract: Alzheimer’s disease (AD)-related global healthcare cost is estimated to be $1 trillion by 2050. Currently, there is no cure for this disease; however, clinical studies show that early diagnosis and intervention help to extend the quality of life and inform technologies for personalized mental healthcare. Clinical research indicates that the onset and progression of Alzheimer’s disease lead to dementia and other mental health issues. As a result, the language capabilities of patients start to decline. In this paper, we show that machine learning-based unsupervised clustering of, and anomaly detection with, linguistic biomarkers are promising approaches for intuitive visualization and personalized early stage detection of Alzheimer’s disease. We demonstrate this approach on a data set of 10 years (1980 to 1989) of President Ronald Reagan’s speeches. Key linguistic biomarkers that indicate early-stage AD are identified. Experimental results show that Reagan had early onset of Alzheimer’s sometime between 1983 and 1987. This finding is corroborated by prior work that analyzed his interviews using a statistical technique. The proposed technique also identifies the exact speeches that reflect linguistic biomarkers for early stage AD.
Citations: 3