IEEE International Conference on Healthcare Informatics: Latest Articles

Prediction of COVID-19 Patients' Emergency Room Revisit using Multi-Source Transfer Learning.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ICHI57859.2023.00028
Yuelyu Ji, Yuhe Gao, Runxue Bao, Qi Li, Disheng Liu, Yiming Sun, Ye Ye
Abstract: The coronavirus disease 2019 (COVID-19) has led to a global pandemic of significant severity. In addition to its high level of contagiousness, COVID-19 can have a heterogeneous clinical course, ranging from asymptomatic carriers to severe and potentially life-threatening health complications. Many patients have to revisit the emergency room (ER) within a short time after discharge, which significantly increases the workload for medical staff. Early identification of such patients is crucial for helping physicians focus on treating life-threatening cases. In this study, we obtained Electronic Health Records (EHRs) of 3,210 encounters from 13 affiliated ERs within the University of Pittsburgh Medical Center between March 2020 and January 2021. We leveraged a Natural Language Processing technique, ScispaCy, to extract clinical concepts and used the 1001 most frequent concepts to develop 7-day revisit models for COVID-19 patients in ERs. Because the data were collected from 13 ERs, distributional differences among sites could affect model development. To address this issue, we employed a classic deep transfer learning method called the Domain Adversarial Neural Network (DANN) and evaluated different modeling strategies, including the Multi-DANN algorithm (which considers the source differences), the Single-DANN algorithm (which doesn't consider the source differences), and three baseline methods: using only source data, using only target data, and using a mixture of source and target data. Results showed that the Multi-DANN models outperformed the Single-DANN models and baseline models in predicting revisits of COVID-19 patients to the ER within 7 days after discharge (median AUROC = 0.8 vs. 0.5). Notably, the Multi-DANN strategy effectively addressed the heterogeneity among multiple source domains and improved the adaptation of source data to the target domain. Moreover, the high performance of Multi-DANN models indicates that EHRs are informative for developing a prediction model to identify COVID-19 patients who are very likely to revisit an ER within 7 days after discharge.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10939709/pdf/
Citations: 0
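The abstract above compares models by AUROC (0.8 vs. 0.5). As a reading aid, here is a minimal sketch of what that metric measures: the probability that a randomly chosen positive case is scored above a randomly chosen negative one, so 0.5 corresponds to chance. The function name and the toy labels and scores are made up for illustration and are not taken from the paper.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = revisit within 7 days, 0 = no revisit.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.6]
toy_auroc = auroc(labels, scores)
print(round(toy_auroc, 3))
```

The pairwise loop is quadratic and only meant to make the definition visible; a rank-based computation would be preferable on real data.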
Named Entity Recognition and Normalization for Alzheimer's Disease Eligibility Criteria.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00100
Zenan Sun, Cui Tao
Abstract: Alzheimer's Disease (AD) is a complex neurodegenerative disorder that affects millions of people worldwide. Finding effective treatments for this disease is crucial. Clinical trials play an essential role in developing and testing new treatments for AD. However, identifying eligible participants can be challenging, time-consuming, and costly. In recent years, the development of natural language processing (NLP) techniques, specifically named entity recognition (NER) and named entity normalization (NEN), has helped to automate the identification and extraction of relevant information from the eligibility criteria (EC) more efficiently, in order to facilitate semi-automatic patient recruitment and enable data FAIRness for clinical trial data. Nevertheless, most current biomedical NER models only provide annotations for a restricted set of entity types that may not be applicable to the clinical trial data. Additionally, there are currently no established techniques for accurately performing NEN on entities that are negated using a negative prefix. In this paper, we introduce a pipeline designed for information extraction from AD clinical trial EC, which involves preprocessing of the EC data, clinical NER, and biomedical NEN to the Unified Medical Language System (UMLS). Our NER model can identify named entities in seven pre-defined categories, while our NEN model employs a combination of exact match and partial match search strategies, as well as customized rules to accurately normalize entities with negative prefixes. To evaluate the performance of our pipeline, we measured the precision, recall, and F1 score for the NER component, and we manually reviewed the top five mapping results produced by the NEN component. Our evaluation revealed that the pipeline can normalize named entities in clinical trial ECs with high accuracy. The NER component achieved an overall F1 of 0.816, demonstrating its ability to accurately identify seven types of named entities in clinical text. The NEN component of the pipeline also demonstrated impressive performance, with customized rules and a combination of exact and partial match strategies leading to an accuracy of 0.940 for normalized entities.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10815931/pdf/
Citations: 0
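The abstract above describes normalization that combines exact match, partial match, and rules for negative prefixes. The following is only a rough sketch of that general idea, not the authors' pipeline: the vocabulary, the concept IDs (placeholders, not real UMLS CUIs), the prefix list, and the token-overlap partial match are all assumptions chosen to keep the example self-contained.

```python
# Toy vocabulary: term -> placeholder concept ID (hypothetical, NOT real UMLS CUIs).
TOY_VOCAB = {
    "alzheimer's disease": "DEMO:0001",
    "dementia": "DEMO:0002",
    "diabetic": "DEMO:0003",
    "vascular dementia": "DEMO:0004",
}
# Illustrative negative prefixes; a real rule set would be larger.
NEGATIVE_PREFIXES = ("non-", "non ", "non")


def normalize(mention):
    """Return (concept_id, negated, match_type), or None when nothing matches."""
    text = mention.lower().strip()
    negated = False
    # Rule: strip a negative prefix only when the remainder is a known term,
    # so ordinary words that merely start with "non" are left alone.
    if text not in TOY_VOCAB:
        for prefix in NEGATIVE_PREFIXES:
            rest = text[len(prefix):].strip()
            if text.startswith(prefix) and rest in TOY_VOCAB:
                text, negated = rest, True
                break
    # Exact match first.
    if text in TOY_VOCAB:
        return TOY_VOCAB[text], negated, "exact"
    # Partial match: pick the vocabulary term sharing the most tokens.
    tokens = set(text.split())
    best = max(TOY_VOCAB, key=lambda term: len(tokens & set(term.split())))
    if tokens & set(best.split()):
        return TOY_VOCAB[best], negated, "partial"
    return None


print(normalize("non-diabetic"))
print(normalize("severe dementia"))
```

One limitation of this sketch is that negation is detected only when the stripped remainder matches exactly; a negated mention that needs a partial match would be returned without the negation flag.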
Exploring the Effect of Eligibility Criteria on AD Severity and Severe Adverse Event in Eligible Patients.
IEEE International Conference on Healthcare Informatics. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00139
Aokun Chen, Qian Li, Elizabeth Shenkman, Yonghui Wu, Yi Guo, Jiang Bian
Abstract: Clinical trials are vital tools to prove the effectiveness and safety of medications. To maximize generalizability, the study sample should represent the sample population and the target population. However, clinical trial design tends to favor the evaluation of drug safety and procedure (i.e., internal validity) without clear knowledge of its penalty on trial generalizability (i.e., external validity). Alzheimer's Disease (AD) trials are known to have generalizability issues. Thus, in this study, we explore the effect of eligibility criteria on AD severity and on severe adverse events (SAEs) among eligible patients.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11273173/pdf/
Citations: 0
Mitigating Membership Inference in Deep Learning Applications with High Dimensional Genomic Data.
IEEE International Conference on Healthcare Informatics. Pub Date: 2022-06-01. DOI: 10.1109/ichi54592.2022.00101
Chonghao Zhang, Luca Bonomi
Abstract: The use of deep learning techniques in medical applications holds great promise for advancing health care. However, there are growing privacy concerns regarding what information about individual data contributors (i.e., patients in the training set) these deep models may reveal when shared with external users. In this work, we first investigate the membership privacy risks in sharing deep learning models for cancer genomics tasks, and then study the applicability of privacy-protecting strategies for mitigating these privacy risks.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9473339/pdf/nihms-1815588.pdf
Citations: 2
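To make the membership privacy risk mentioned above concrete, here is a minimal sketch of one simple, well-known style of attack: guessing that a record was in the training set when the model is very confident on it. This is an illustrative baseline, not necessarily the attack studied in the paper; the confidence values and the threshold are invented.

```python
def threshold_attack(confidences, threshold):
    """Guess 'member' when the model's confidence on the true label is >= threshold."""
    return [c >= threshold for c in confidences]


def attack_accuracy(member_conf, nonmember_conf, threshold):
    """Share of records whose membership status the attacker guesses correctly."""
    guesses_members = threshold_attack(member_conf, threshold)
    guesses_nonmembers = threshold_attack(nonmember_conf, threshold)
    correct = sum(guesses_members) + sum(not g for g in guesses_nonmembers)
    return correct / (len(member_conf) + len(nonmember_conf))


# Hypothetical confidences: overfit models tend to be more confident on training data.
member_conf = [0.99, 0.97, 0.95, 0.80]
nonmember_conf = [0.90, 0.60, 0.55, 0.98]
acc = attack_accuracy(member_conf, nonmember_conf, threshold=0.93)
print(acc)
```

An accuracy well above 0.5 on a balanced set of members and non-members indicates leakage; mitigation strategies aim to push it back toward chance.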
Mining Social Media Data to Predict COVID-19 Case Counts.
IEEE International Conference on Healthcare Informatics. Pub Date: 2022-06-01. Epub Date: 2022-09-08. DOI: 10.1109/ichi54592.2022.00027
Maksims Kazijevs, Furkan A Akyelken, Manar D Samad
Abstract: The unpredictability and unknowns surrounding the ongoing coronavirus disease (COVID-19) pandemic have had unprecedented consequences, taking a heavy toll on the lives and economies of all countries. There have been efforts to predict COVID-19 case counts (CCC) using epidemiological data and numerical tokens online, which may allow early preventive measures to slow the spread of the disease. In this paper, we use state-of-the-art natural language processing (NLP) algorithms to numerically encode COVID-19 related tweets originating from eight cities in the United States and predict city-specific CCC up to eight days in the future. A city-embedding is proposed to obtain a time series representation of daily tweets posted from a city, which is then used to predict case counts using a custom long short-term memory (LSTM) model. The universal sentence encoder yields the best normalized root mean squared error (NRMSE) 0.090 (0.039), averaged across all cities in predicting CCC six days in the future. The R² scores in predicting CCC are more than 0.70 and often over 0.8, which suggests a strong correlation between the actual and our model predicted CCC values. Our analyses show that the NRMSE and R² scores are consistently robust across different cities and different numbers of time steps in time series data. Results show that the LSTM model can learn the mapping between the NLP-encoded tweet semantics and the case counts, which suggests that social media text can be directly mined to identify the future course of the pandemic.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9490453/pdf/nihms-1836082.pdf
Citations: 1
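The two metrics reported above, NRMSE and R², can be sketched as follows. The abstract does not say how the RMSE was normalized, so dividing by the range of the actual values is an assumption here (dividing by the mean is another common convention). The case counts are invented.

```python
import math


def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed normalization)."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / (max(actual) - min(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily case counts and model predictions for one city.
actual = [100.0, 120.0, 150.0, 130.0]
predicted = [110.0, 115.0, 140.0, 135.0]
nrmse_val = nrmse(actual, predicted)
r2_val = r_squared(actual, predicted)
print(round(nrmse_val, 3), round(r2_val, 3))
```

Lower NRMSE and higher R² both indicate predictions that track the observed counts more closely.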
Sharing Time-to-Event Data with Privacy Protection.
IEEE International Conference on Healthcare Informatics. Pub Date: 2022-06-01. Epub Date: 2022-09-08. DOI: 10.1109/ichi54592.2022.00014
Luca Bonomi, Liyue Fan
Abstract: Sharing time-to-event data is beneficial for enabling collaborative research efforts (e.g., survival studies), facilitating the design of effective interventions, and advancing patient care (e.g., early diagnosis). Despite numerous privacy solutions for sharing time-to-event data, recent research studies have shown that external information may become available (e.g., self-disclosure of study participation on social media) to an adversary, posing new privacy concerns. In this work, we formulate a cohort inference attack for time-to-event data sharing, in which an informed adversary aims at inferring the membership of a target individual in a specific cohort. Our study investigates the privacy risks associated with time-to-event data and evaluates the empirical privacy protection offered by popular privacy-protecting solutions (e.g., binning, differential privacy). Furthermore, we propose a novel approach to privately release individual level time-to-event data with high utility, while providing indistinguishability guarantees for the input value. Our method TE-Sanitizer is shown to provide effective mitigation against the inference attacks and high usefulness in survival analysis. The results and discussion provide domain experts with insights on the privacy and the usefulness of the studied methods.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9473343/pdf/nihms-1815589.pdf
Citations: 0
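The abstract above names binning and differential privacy as popular baseline protections. The sketch below illustrates those two generic techniques only; it is not the TE-Sanitizer method, whose details are not given here. The event times, bin width, and epsilon are hypothetical, and the noisy count assumes a sensitivity of 1 (adding or removing one individual changes the count by at most one).

```python
import random


def bin_time(t, width):
    """Generalize an event time to the lower edge of its bin."""
    return (t // width) * width


def noisy_count(times, lo, hi, epsilon, rng):
    """Count events in [lo, hi) and add Laplace noise with scale 1/epsilon."""
    true_count = sum(lo <= t < hi for t in times)
    # The difference of two Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Hypothetical event times in days since enrollment.
times = [3, 14, 27, 45, 61, 62]
binned = [bin_time(t, 30) for t in times]
rng = random.Random(0)  # seeded only to make the demo repeatable
released = noisy_count(times, 0, 30, epsilon=1.0, rng=rng)
print(binned, released)
```

Binning coarsens each individual value, while the Laplace mechanism perturbs an aggregate; smaller epsilon means more noise and stronger protection.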
Detection of Dementia Signals from Longitudinal Clinical Visits Using One-Class Classification.
IEEE International Conference on Healthcare Informatics. Pub Date: 2022-06-01. Epub Date: 2022-09-08. DOI: 10.1109/ichi54592.2022.00040
Omar A Ibrahim, Sunyang Fu, Maria Vassilaki, Michelle M Mielke, Jennifer St Sauver, Ronald C Petersen, Sunghwan Sohn
Abstract: Dementia is one of the major health challenges in aging populations, with 50 million people diagnosed worldwide. However, dementia is often underdiagnosed or diagnosed late, resulting in missed opportunities for appropriate care plans. Identifying early signs of dementia is essential for better life quality of aging populations. Monitoring early signs of individual health changes could help clinicians diagnose dementia in its early stages with more effective treatment plans. However, the scarcity of data for dementia cases compared to normal cases (i.e., imbalanced class distribution) makes it challenging to develop robust supervised learning models. In order to alleviate this issue, we investigated one-class classification (OCC) techniques, which use only the majority class (i.e., normal cases) in model development, to detect dementia signals from older adult clinical visits. The OCC models identify abnormality of older adults' longitudinal health conditions to predict incident dementia. The predictive performance of the OCC was compared with a recent streaming clustering-based technique and demonstrated higher predictive power. Our analysis showed that OCC has a promising potential to increase power in predicting dementia.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728104/pdf/nihms-1852693.pdf
Citations: 0
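One-class classification, as described above, fits a model on the majority (normal) class only and flags departures from it. The abstract does not name the specific OCC algorithm, so the z-score profile below is a deliberately simple stand-in chosen for readability; the features, values, and threshold are hypothetical.

```python
import math


def fit_normal_profile(normal_rows):
    """Per-feature mean and standard deviation estimated from normal cases only."""
    n = len(normal_rows)
    dims = len(normal_rows[0])
    means = [sum(r[j] for r in normal_rows) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((r[j] - means[j]) ** 2 for r in normal_rows) / n
        stds.append(math.sqrt(var) or 1.0)  # avoid dividing by zero for constant features
    return means, stds


def anomaly_score(row, means, stds):
    """Largest absolute z-score across features."""
    return max(abs(x - m) / s for x, m, s in zip(row, means, stds))


def is_abnormal(row, means, stds, threshold=3.0):
    """Flag a visit whose score exceeds the (assumed) threshold."""
    return anomaly_score(row, means, stds) > threshold


# Hypothetical per-visit features for patients without dementia.
normal_rows = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 11.0]]
means, stds = fit_normal_profile(normal_rows)
print(is_abnormal([2.5, 11.5], means, stds), is_abnormal([2.0, 20.0], means, stds))
```

No dementia cases are needed for training, which is the property that makes this family of methods attractive when the positive class is rare.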
A comparison of few-shot and traditional named entity recognition models for medical text.
IEEE International Conference on Healthcare Informatics. Pub Date: 2022-06-01. DOI: 10.1109/ichi54592.2022.00024
Yao Ge, Yuting Guo, Yuan-Chi Yang, Mohammed Ali Al-Garadi, Abeed Sarker
Abstract: Many research problems involving medical texts have limited amounts of annotated data available (e.g., expressions of rare diseases). Traditional supervised machine learning algorithms, particularly those based on deep neural networks, require large volumes of annotated data, and they underperform when only small amounts of labeled data are available. Few-shot learning (FSL) is a category of machine learning models that are designed with the intent of solving problems that have small annotated datasets available. However, there is no current study that compares the performances of FSL models with traditional models (e.g., conditional random fields) for medical text at different training set sizes. In this paper, we attempted to fill this gap in research by comparing multiple FSL models with traditional models for the task of named entity recognition (NER) from medical texts. Using five health-related annotated NER datasets, we benchmarked three traditional NER models based on BERT: BERT-Linear Classifier (BLC), BERT-CRF (BC), and SANER; and three FSL NER models: StructShot & NNShot, Few-Shot Slot Tagging (FS-ST), and ProtoNER. Our benchmarking results show that almost all models, whether traditional or FSL, achieve significantly lower performances compared to the state-of-the-art with small amounts of training data. For the NER experiments we executed, the F1-scores were very low with small training sets, typically below 30%. FSL models that were reported to perform well on non-medical texts significantly underperformed, compared to their reported best, on medical texts. Our experiments also suggest that FSL methods tend to perform worse on data sets from noisy sources of medical texts, such as social media (which includes misspellings and colloquial expressions), compared to less noisy sources such as medical literature. Our experiments demonstrate that the current state-of-the-art FSL systems are not yet suitable for effective NER in medical natural language processing tasks, and further research needs to be carried out to improve their performances. Creation of specialized, standardized datasets replicating real-world scenarios may help to move this category of methods forward.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10462421/pdf/nihms-1926966.pdf
Citations: 2
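The F1-scores discussed above are typically computed at the entity level for NER. The sketch below shows one common way to do that from BIO tag sequences, counting a prediction as correct only when both its boundaries and its type match. Whether the paper scored entities exactly this way is an assumption; the tag names and sequences are invented.

```python
def extract_spans(tags):
    """Convert a BIO tag sequence into a set of (start, end_exclusive, type) spans."""
    spans, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, kind))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans


def span_f1(gold_tags, pred_tags):
    """Entity-level precision, recall, and F1 under exact span-and-type matching."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Hypothetical gold and predicted tags for a five-token sentence.
gold = ["B-DRUG", "I-DRUG", "O", "B-SYM", "O"]
pred = ["B-DRUG", "I-DRUG", "O", "O", "B-SYM"]
print(span_f1(gold, pred))
```

In this sketch an "I-" tag with no preceding "B-" of the same type is ignored rather than repaired, which is one of several conventions in use.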
Annotating Music Therapy, Chiropractic and Aquatic Exercise Using Electronic Health Record.
IEEE International Conference on Healthcare Informatics. Pub Date: 2022-06-01. Epub Date: 2022-09-08. DOI: 10.1109/ichi54592.2022.00121
Huixue Zhou, Greg Silverman, Zhongran Niu, Jenzi Silverman, Roni Evans, Robin Austin, Rui Zhang
Abstract: Complementary and Integrative Health (CIH) has gained increasing popularity in the past decades. The overall goal of this study is to represent information pertinent to music therapy, chiropractic and aquatic exercise in an EHR system. A total of 300 clinical notes were randomly selected and manually annotated. Annotations were made for status, symptom and frequency of each approach. This set of annotations was used as a gold standard to evaluate the performance of the NLP systems used in this study (specifically BioMedICUS, MetaMap and cTAKES) for extracting CIH concepts. The three NLP systems achieved an average lenient match F1-score of 0.50 across all three CIH approaches. BioMedICUS achieved the best performance in music therapy with an F1-score of 0.73. This study is a pilot to investigate CIH representation in clinical notes and lays a foundation for using EHRs for clinical research on CIH approaches.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10110363/pdf/nihms-1890434.pdf
Citations: 0
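The abstract above reports a lenient match F1-score. A minimal sketch of the difference between lenient and strict matching follows, assuming that "lenient" means a predicted span counts when it overlaps a gold span by at least one character; the paper's exact criterion is not stated here, and the character offsets are invented.

```python
def overlaps(a, b):
    """True when two (start, end_exclusive) character spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def match_f1(gold, predicted, lenient=True):
    """F1 over spans; lenient accepts any overlap, strict requires identical offsets."""
    def hit(span, others):
        return any(overlaps(span, o) if lenient else span == o for o in others)

    matched_pred = sum(hit(p, gold) for p in predicted)
    matched_gold = sum(hit(g, predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotator (gold) and system (predicted) spans in one note.
gold = [(10, 23), (40, 52), (70, 86)]
predicted = [(10, 23), (44, 52), (90, 95)]
lenient_f1 = match_f1(gold, predicted, lenient=True)
strict_f1 = match_f1(gold, predicted, lenient=False)
print(round(lenient_f1, 3), round(strict_f1, 3))
```

The partially overlapping span (44, 52) is credited under lenient matching but not under strict matching, which is why lenient scores are never lower than strict ones.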
Classifying Drug Ratings Using User Reviews with Transformer-Based Language Models.
IEEE International Conference on Healthcare Informatics. Pub Date: 2022-06-01. DOI: 10.1109/ichi54592.2022.00035
Akhil Shiju, Zhe He
Abstract: Drug review websites such as Drugs.com provide users' textual reviews and numeric ratings of drugs. These reviews, along with the ratings, are used by consumers to choose a drug. However, the numeric ratings may not always be consistent with text reviews, and purely relying on the rating score for finding positive/negative reviews may not be reliable. Automatic classification of user ratings based on textual review can create a more reliable rating for drugs. In this project, we built classification models to classify drug review ratings using textual reviews with traditional machine learning and deep learning models. Traditional machine learning models including Random Forest and Naive Bayes classifiers were built using TF-IDF features as input. Also, transformer-based neural network models including BERT, Bio_ClinicalBERT, RoBERTa, XLNet, ELECTRA, and ALBERT were built using the raw text as input. Overall, the Bio_ClinicalBERT model outperformed the other models with an overall accuracy of 87%. We further identified concepts of the Unified Medical Language System (UMLS) from the postings and analyzed their semantic types stratified by class types. This research demonstrated that transformer-based models can be used to classify drug reviews based solely on textual reviews.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9744636/pdf/nihms-1855900.pdf
Citations: 7
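The traditional models above take TF-IDF features as input. As a reading aid, here is a minimal sketch of the textbook weighting, term frequency times log(N / document frequency). The whitespace tokenizer and the sample reviews are assumptions, and library implementations often use smoothed variants of the IDF term, so exact values will differ from theirs.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical review snippets.
reviews = [
    "this drug worked great",
    "this drug caused severe nausea",
    "great relief no nausea",
]
vectors = tfidf(reviews)
print(vectors[0])
```

Terms that appear in fewer reviews, such as "worked", receive higher weights than terms shared across reviews, such as "this", which is what lets a downstream classifier focus on distinctive words.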