IEEE International Conference on Healthcare Informatics: Latest Articles

Mitigating Membership Inference in Deep Survival Analyses with Differential Privacy.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00022
Liyue Fan, Luca Bonomi
Abstract: Deep neural networks have been increasingly integrated in healthcare applications to enable accurate predictive analyses. Sharing trained deep models not only facilitates knowledge integration in collaborative research efforts but also enables equitable access to computational intelligence. However, recent studies have shown that an adversary may leverage a shared model to learn the participation of a target individual in the training set. In this work, we investigate privacy-protecting model sharing for survival studies. Specifically, we pose three research questions. (1) Do deep survival models leak membership information? (2) How effective is differential privacy in defending against membership inference in deep survival analyses? (3) Are there other effects of differential privacy on deep survival analyses? Our study assesses the membership leakage in emerging deep survival models and develops differentially private training procedures to provide rigorous privacy protection. The experimental results show that deep survival models leak membership information and our approach effectively reduces membership inference risks. The results also show that differential privacy introduces a limited performance loss, and may improve the model robustness in the presence of noisy data, compared to non-private models.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10751041/pdf/
Citations: 0
An LSTM-based Gesture-to-Speech Recognition System.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00062
Riyad Bin Rafiq, Syed Araib Karim, Mark V Albert
Abstract: Fast and flexible communication options are limited for speech-impaired people. Hand gestures coupled with fast, generated speech can enable a more natural social dynamic for those individuals, particularly individuals without the fine motor skills to type on a keyboard or tablet reliably. We created a mobile phone application prototype that generates audible responses associated with trained hand movements and collects and organizes the accelerometer data for rapid training to allow tailored models for individuals who may not be able to perform standard movements such as sign language. Six participants performed 11 distinct gestures to produce the dataset. A mobile application was developed that integrated a bidirectional LSTM network architecture which was trained from this data. After evaluation using nested subject-wise cross-validation, our integrated bidirectional LSTM model demonstrates an overall recall of 91.8% in recognition of these pre-selected 11 hand gestures, with recall at 95.8% when two commonly confused gestures were not assessed. This prototype is a step in creating a mobile phone system capable of capturing new gestures and developing tailored gesture recognition models for individuals in speech-impaired populations. Further refinement of this prototype can enable fast and efficient communication with the goal of further improving social interaction for individuals unable to speak.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10894657/pdf/
Citations: 0
Benchmarking Transformer-Based Models for Identifying Social Determinants of Health in Clinical Notes.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00102
Xiaoyu Wang, Dipankar Gupta, Michael Killian, Zhe He
Abstract: Electronic health records (EHR) have been widely used in building machine learning models for health outcomes prediction. However, many EHR-based models are inherently biased due to a lack of risk factors on social determinants of health (SDoH), which are responsible for up to 40% of preventable deaths. As SDoH information is often captured in clinical notes, recent efforts have been made to extract such information from notes with natural language processing and append it to other structured data. In this work, we benchmark 7 pre-trained transformer-based models, including BERT, ALBERT, BioBERT, BioClinicalBERT, RoBERTa, ELECTRA, and RoBERTa-MIMIC-Trial, for recognizing SDoH terms using a previously annotated corpus of MIMIC-III clinical notes. Our study shows that the BioClinicalBERT model performs best on F1 scores (0.911, 0.923) under both strict and relaxed criteria. This work shows the promise of using transformer-based models for recognizing SDoH information from clinical notes.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10795706/pdf/
Citations: 0
An End-to-end In-Silico and In-Vitro Drug Repurposing Pipeline for Glioblastoma.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00135
Ko-Hong Lin, Jay-Jiguang Zhu, Judith A Smith, Yejin Kim, Xiaoqian Jiang
Abstract: Our study aims to address the challenges in drug development for glioblastoma, a highly aggressive brain cancer with poor prognosis. We propose a computational framework that utilizes machine learning-based propensity score matching to estimate counterfactual treatment effects and predict synergistic effects of drug combinations. Through our in-silico analysis, we identified promising drug candidates and drug combinations that warrant further investigation. To validate these computational findings, we conducted in-vitro experiments on two GBM cell lines, U87 and T98G. The experimental results demonstrated that some of the identified drugs and drug combinations indeed exhibit strong suppressive effects on GBM cell growth. Our end-to-end pipeline showcases the feasibility of integrating computational models with biological experiments to expedite drug repurposing and discovery efforts. By bridging the gap between in-silico analysis and in-vitro validation, we demonstrate the potential of this approach to accelerate the development of novel and effective treatments for glioblastoma.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10956733/pdf/
Citations: 0
Improving Prediction of Late Symptoms using LSTM and Patient-reported Outcomes for Head and Neck Cancer Patients.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00047
Yaohua Wang, Lisanne Van Dijk, Abdallah S R Mohamed, Mohamed Naser, Clifton David Fuller, Xinhua Zhang, G Elisabeta Marai, Guadalupe Canahuate
Abstract: Patient-Reported Outcomes (PRO) are collected directly from the patients using symptom questionnaires. In the case of head and neck cancer patients, PRO surveys are recorded every week during treatment with each patient's visit to the clinic and at different follow-up times after the treatment has concluded. PRO surveys can be very informative regarding the patient's status and the effect of treatment on the patient's quality of life (QoL). Processing PRO data is challenging for several reasons. First, missing data is frequent as patients might skip a question or a questionnaire altogether. Second, PROs are patient-dependent: a rating of 5 for one patient might be a rating of 10 for another patient. Finally, most patients experience severe symptoms during treatment which usually subside over time. However, for some patients, late toxicities persist, negatively affecting the patient's QoL. These long-term severe symptoms are hard to predict and are the focus of this study. In this work, we model PRO data collected from head and neck cancer patients treated at the MD Anderson Cancer Center using the MD Anderson Symptom Inventory (MDASI) questionnaire as time series. We impute missing values with a combination of K nearest neighbor (KNN) and Long Short-Term Memory (LSTM) neural networks, and finally, apply LSTM to predict late symptom severity 12 months after treatment. We compare performance against clinical and ARIMA models. We show that the LSTM model combined with KNN imputation is effective in predicting late-stage symptom ratings for occurrence and severity under the AUC and F1 score metrics.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10853990/pdf/
Citations: 0
CareD: Caregiver's Experience with Cognitive Decline in Reddit Posts.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00104
Muskan Garg, Sunghwan Sohn
Abstract: With advancements in analysis of cognitive decline in electronic health records, the research community witnesses a recent surge in social media posting by caregivers and/or loved ones of people with cognitive decline. The major challenges in this area are availability of large and diverse datasets, ethics of data collection and sharing, diagnostic specificity and clinical acceptability. To this end, we construct a new dataset, Caregivers experiences with cognitive Decline (CareD), of 1005 posts with more than 194K words and 9541 sentences, highlighting discussions on people with dementia and Alzheimer's disease on Reddit. We discuss the changing trends of discussions on cognitive decline in social media and open challenges for natural language processing and social computing. We first identify the Reddit posts reflecting substantial information as candidate posts. We further formulate the annotation guidelines, handle perplexities to investigate the existence of experiences, self-reported articles and potential caregiver in candidate posts, resulting in the discovery of latent symptoms, firsthand information, and prospective source of longitudinal information about the patient, respectively.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10877621/pdf/
Citations: 0
End-to-End n-ary Relation Extraction for Combination Drug Therapies.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00021
Yuhang Jiang, Ramakanth Kavuluru
Abstract: Combination drug therapies are treatment regimens that involve two or more drugs, administered more commonly for patients with cancer, HIV, malaria, or tuberculosis. Currently there are over 350K articles in PubMed that use the combination drug therapy MeSH heading with at least 10K articles published per year over the past two decades. Extracting combination therapies from scientific literature inherently constitutes an n-ary relation extraction problem. Unlike in the general n-ary setting where n is fixed (e.g., drug-gene-mutation relations where n = 3), extracting combination therapies is a special setting where n ≥ 2 is dynamic, depending on each instance. Recently, Tiktinsky et al. (NAACL 2022) introduced a first of its kind dataset, CombDrugExt, for extracting such therapies from literature. Here, we use a sequence-to-sequence style end-to-end extraction method to achieve an F1-Score of 66.7% on the CombDrugExt test set for positive (or effective) combinations. This is an absolute ≈ 5% F1-score improvement even over the prior best relation classification score with spotted drug entities (hence, not end-to-end). Thus our effort introduces a state-of-the-art first model for end-to-end extraction that is already superior to the best prior non end-to-end model for this task. Our model seamlessly extracts all drug entities and relations in a single pass and is highly suitable for dynamic n-ary extraction scenarios.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10814995/pdf/
Citations: 0
Inferring Personalized Treatment Effect of Antihypertensives on Alzheimer's Disease Using Deep Learning.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00018
Pulakesh Upadhyaya, Yaobin Ling, Luyao Chen, Yejin Kim, Xiaoqian Jiang
Abstract: Alzheimer's disease (AD) is one of the leading causes of death in the United States, especially among the elderly. Recent studies have shown how hypertension is related to cognitive decline in elderly patients, which in turn leads to increased mortality as well as morbidity. There have been various studies that have looked at the effect of antihypertensive drugs in reducing cognitive decline, and their results have proved inconclusive. However, most of these studies assume the treatment effect is similar for all patients, thus considering only the average treatment effects of antihypertensive drugs. In this paper, we assume that the effect of antihypertensives on the onset of AD depends on patient characteristics. We develop a deep learning method called LASSO-Dragonnet to estimate the individualized treatment effects of each patient. We considered six antihypertensive drugs, and each of the six models considered one of the drugs as the treatment and the remaining as control. Our studies showed that although many antihypertensives have a positive impact in delaying AD onset on average, the impact varies from individual to individual, depending on their various characteristics. We also analyzed the importance of various covariates in such an estimation. Our results showed that the individualized treatment effects of each patient could be estimated accurately using a deep learning method, and that the importance of various covariates could be determined.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10956734/pdf/
Citations: 0
Graph Neural Network Modeling of Web Search Activity for Real-time Pandemic Forecasting.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00027
Chen Lin, Jianghong Zhou, Jing Zhang, Carl Yang, Eugene Agichtein
Abstract: The utilization of web search activity for pandemic forecasting has significant implications for managing disease spread and informing policy decisions. However, web search records tend to be noisy and influenced by geographical location, making it difficult to develop large-scale models. While regularized linear models have been effective in predicting the spread of respiratory illnesses like COVID-19, they are limited to specific locations. The lack of incorporation of neighboring areas' data and the inability to transfer models to new locations with limited data has impeded further progress. To address these limitations, this study proposes a novel self-supervised message-passing neural network (SMPNN) framework for modeling local and cross-location dynamics in pandemic forecasting. The SMPNN framework utilizes an MPNN module to learn cross-location dependencies through self-supervised learning and improve local predictions with graph-generated features. The framework is designed as an end-to-end solution and is compared with state-of-the-art statistical and deep learning models using COVID-19 data from England and the US. The results of the study demonstrate that the SMPNN model outperforms other models by achieving up to a 6.9% improvement in prediction accuracy and lower prediction errors during the early stages of disease outbreaks. This approach represents a significant advancement in disease surveillance and forecasting, providing a novel methodology, datasets, and insights that combine web search data and spatial information. The proposed SMPNN framework offers a promising avenue for modeling the spread of pandemics, leveraging both local and cross-location information, and has the potential to inform public health policy decisions.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10853009/pdf/
Citations: 0
End-to-End Models for Chemical-Protein Interaction Extraction: Better Tokenization and Span-Based Pipeline Strategies.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00108
Xuguang Ai, Ramakanth Kavuluru
Abstract: End-to-end relation extraction (E2ERE) is an important task in information extraction, more so for biomedicine as scientific literature continues to grow exponentially. E2ERE typically involves identifying entities (or named entity recognition (NER)) and associated relations, while most RE tasks simply assume that the entities are provided upfront and end up performing relation classification. E2ERE is inherently more difficult than RE alone given the potential snowball effect of errors from NER leading to more errors in RE. A complex dataset in biomedical E2ERE is the ChemProt dataset (BioCreative VI, 2017) that identifies relations between chemical compounds and genes/proteins in scientific literature. ChemProt is included in all recent biomedical natural language processing benchmarks including BLUE, BLURB, and BigBio. However, its treatment in these benchmarks and in other separate efforts is typically not end-to-end, with few exceptions. In this effort, we employ a span-based pipeline approach to produce a new state-of-the-art E2ERE performance on the ChemProt dataset, resulting in > 4% improvement in F1-score over the prior best effort. Our results indicate that a straightforward fine-grained tokenization scheme helps span-based approaches excel in E2ERE, especially with regard to handling complex named entities. Our error analysis also identifies a few key failure modes in E2ERE for ChemProt.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10809256/pdf/
Citations: 0