{"title":"Sharing Time-to-Event Data with Privacy Protection.","authors":"Luca Bonomi, Liyue Fan","doi":"10.1109/ichi54592.2022.00014","DOIUrl":"10.1109/ichi54592.2022.00014","url":null,"abstract":"<p><p>Sharing time-to-event data is beneficial for enabling collaborative research efforts (e.g., survival studies), facilitating the design of effective interventions, and advancing patient care (e.g., early diagnosis). Despite numerous privacy solutions for sharing time-to-event data, recent research studies have shown that external information may become available (e.g., self-disclosure of study participation on social media) to an adversary, posing new privacy concerns. In this work, we formulate a cohort inference attack for time-to-event data sharing, in which an informed adversary aims at inferring the membership of a target individual in a specific cohort. Our study investigates the privacy risks associated with time-to-event data and evaluates the empirical privacy protection offered by popular privacy-protecting solutions (e.g., binning, differential privacy). Furthermore, we propose a novel approach to privately release individual level time-to-event data with high utility, while providing indistinguishability guarantees for the input value. Our method TE-Sanitizer is shown to provide effective mitigation against the inference attacks and high usefulness in survival analysis. The results and discussion provide domain experts with insights on the privacy and the usefulness of the studied methods.</p>","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9473343/pdf/nihms-1815589.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10181249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of Dementia Signals from Longitudinal Clinical Visits Using One-Class Classification.","authors":"Omar A Ibrahim, Sunyang Fu, Maria Vassilaki, Michelle M Mielke, Jennifer St Sauver, Ronald C Petersen, Sunghwan Sohn","doi":"10.1109/ichi54592.2022.00040","DOIUrl":"10.1109/ichi54592.2022.00040","url":null,"abstract":"<p><p>Dementia is one of the major health challenges in aging populations, with 50 million people diagnosed worldwide. However, dementia is often underdiagnosed or delayed resulting in missed opportunities for appropriate care plans. Identifying early signs of dementia is essential for better life quality of aging populations. Monitoring early signs of individual health changes could help clinicians diagnose dementia in its early stages with more effective treatment plans. However, rare data for dementia cases compared to the normal (i.e., imbalance class distribution) make it challenging to develop robust supervised learning models. In order to alleviate this issue, we investigated one-class classification (OCC) techniques, which use only majority class (i.e., normal cases) in model development to detect dementia signals from older adult clinical visits. The OCC models identify abnormality of older adults' longitudinal health conditions to predict incident dementia. The predictive performance of the OCC was compared with a recent streaming clustering-based technique and demonstrated higher predictive power. Our analysis showed that OCC has a promising potential to increase power in predicting dementia.</p>","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728104/pdf/nihms-1852693.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9328507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparison of few-shot and traditional named entity recognition models for medical text.","authors":"Yao Ge, Yuting Guo, Yuan-Chi Yang, Mohammed Ali Al-Garadi, Abeed Sarker","doi":"10.1109/ichi54592.2022.00024","DOIUrl":"https://doi.org/10.1109/ichi54592.2022.00024","url":null,"abstract":"<p><p>Many research problems involving medical texts have limited amounts of annotated data available (<i>e.g</i>., expressions of rare diseases). Traditional supervised machine learning algorithms, particularly those based on deep neural networks, require large volumes of annotated data, and they underperform when only small amounts of labeled data are available. Few-shot learning (FSL) is a category of machine learning models that are designed with the intent of solving problems that have small annotated datasets available. However, there is no current study that compares the performances of FSL models with traditional models (<i>e.g</i>., conditional random fields) for medical text at different training set sizes. In this paper, we attempted to fill this gap in research by comparing multiple FSL models with traditional models for the task of named entity recognition (NER) from medical texts. Using five health-related annotated NER datasets, we benchmarked three traditional NER models based on BERT: BERT-Linear Classifier (BLC), BERT-CRF (BC) and SANER; and three FSL NER models: StructShot & NNShot, Few-Shot Slot Tagging (FS-ST) and ProtoNER. Our benchmarking results show that almost all models, whether traditional or FSL, achieve significantly lower performances compared to the state-of-the-art with small amounts of training data. For the NER experiments we executed, the F<sub>1</sub>-scores were very low with small training sets, typically below 30%. FSL models that were reported to perform well on non-medical texts significantly underperformed, compared to their reported best, on medical texts. Our experiments also suggest that FSL methods tend to perform worse on data sets from noisy sources of medical texts, such as social media (which includes misspellings and colloquial expressions), compared to less noisy sources such as medical literature. Our experiments demonstrate that the current state-of-the-art FSL systems are not yet suitable for effective NER in medical natural language processing tasks, and further research needs to be carried out to improve their performances. Creation of specialized, standardized datasets replicating real-world scenarios may help to move this category of methods forward.</p>","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10462421/pdf/nihms-1926966.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10186790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Annotating Music Therapy, Chiropractic and Aquatic Exercise Using Electronic Health Record.","authors":"Huixue Zhou, Greg Silverman, Zhongran Niu, Jenzi Silverman, Roni Evans, Robin Austin, Rui Zhang","doi":"10.1109/ichi54592.2022.00121","DOIUrl":"10.1109/ichi54592.2022.00121","url":null,"abstract":"<p><p>Complementary and Integrative Health (CIH) has gained increasing popularity in the past decades. The overall goal of this study is to represent information pertinent to music therapy, chiropractic and aquatic exercise in an EHR system. A total of 300 clinical notes were randomly selected and manually annotated. Annotations were made for <i>status</i>, <i>symptom</i> and <i>frequency</i> of each approach. This set of annotations was used as a gold standard to evaluate performance of NLP systems used in this study (specifically BioMedICUS, MetaMap and cTAKES) for extracting CIH concepts. Three NLP systems achieved an average lenient match F1-score of 0.50 in all three CIH approaches. BioMedICUS achieved the best performance in music therapy with an F1-score of 0.73. This study is a pilot to investigate CIH representation in clinical note and lays a foundation for using EHR for clinical research for CIH approaches.</p>","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10110363/pdf/nihms-1890434.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9751841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying Drug Ratings Using User Reviews with Transformer-Based Language Models.","authors":"Akhil Shiju, Zhe He","doi":"10.1109/ichi54592.2022.00035","DOIUrl":"https://doi.org/10.1109/ichi54592.2022.00035","url":null,"abstract":"<p><p>Drug review websites such as Drugs.com provide users' textual reviews and numeric ratings of drugs. These reviews along with the ratings are used for the consumers for choosing a drug. However, the numeric ratings may not always be consistent with text reviews and purely relying on the rating score for finding positive/negative reviews may not be reliable. Automatic classification of user ratings based on textual review can create a more reliable rating for drugs. In this project, we built classification models to classify drug review ratings using textual reviews with traditional machine learning and deep learning models. Traditional machine learning models including Random Forest and Naive Bayesian classifiers were built using TF-IDF features as input. Also, transformer-based neural network models including BERT, Bio_ClinicalBERT, RoBERTa, XLNet, ELECTRA, and ALBERT were built using the raw text as input. Overall, Bio_ClinicalBERT model outperformed the other models with an overall accuracy of 87%. We further identified concepts of the Unified Medical Language System (UMLS) from the postings and analyzed their semantic types stratified by class types. This research demonstrated that transformer-based models can be used to classify drug reviews based solely on textual reviews.</p>","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9744636/pdf/nihms-1855900.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10701370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Radiology Text Analysis System (RadText): Architecture and Evaluation.","authors":"Song Wang, Mingquan Lin, Ying Ding, George Shih, Zhiyong Lu, Yifan Peng","doi":"10.1109/ichi54592.2022.00050","DOIUrl":"https://doi.org/10.1109/ichi54592.2022.00050","url":null,"abstract":"<p><p>Analyzing radiology reports is a time-consuming and error-prone task, which raises the need for an efficient automated radiology report analysis system to alleviate the workloads of radiologists and encourage precise diagnosis. In this work, we present RadText, a high-performance open-source Python radiology text analysis system. RadText offers an easy-to-use text analysis pipeline, including de-identification, section segmentation, sentence split and word tokenization, named entity recognition, parsing, and negation detection. Superior to existing widely used toolkits, RadText features a hybrid text processing schema, supports raw text processing and local processing, which enables higher accuracy, better usability and improved data privacy. RadText adopts BioC as the unified interface, and also standardizes the output into a structured representation that is compatible with Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), which allows for a more systematic approach to observational research across multiple, disparate data sources. We evaluated RadText on the MIMIC-CXR dataset, with five new disease labels that we annotated for this work. RadText demonstrates highly accurate classification performances, with a 0.91 average precision, 0.94 average recall and 0.92 average F-1 score. We also annotated a test set for the five new disease labels to facilitate future research or applications. We have made our code, documentations, examples and the test set available at https://github.com/bionlplab/radtext.</p>","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484781/pdf/nihms-1836549.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40373631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chemical-Protein Relation Extraction with Pre-trained Prompt Tuning.","authors":"Jianping He, Fang Li, Xinyue Hu, Jianfu Li, Yi Nian, Jingqi Wang, Yang Xiang, Qiang Wei, Hua Xu, Cui Tao","doi":"10.1109/ichi54592.2022.00120","DOIUrl":"https://doi.org/10.1109/ichi54592.2022.00120","url":null,"abstract":"<p><p>Biomedical relation extraction plays a critical role in the construction of high-quality knowledge graphs and databases, which can further support many downstream applications. Pre-trained prompt tuning, as a new paradigm, has shown great potential in many natural language processing (NLP) tasks. Through inserting a piece of text into the original input, prompt converts NLP tasks into masked language problems, which could be better addressed by pre-trained language models (PLMs). In this study, we applied pre-trained prompt tuning to chemical-protein relation extraction using the BioCreative VI CHEMPROT dataset. The experiment results showed that the pre-trained prompt tuning outperformed the baseline approach in chemical-protein interaction classification. We conclude that the prompt tuning can improve the efficiency of the PLMs on chemical-protein relation extraction tasks.</p>","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10474649/pdf/nihms-1887657.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10514652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Preliminary Study of Extracting Pulmonary Nodules and Nodule Characteristics from Radiology Reports Using Natural Language Processing.","authors":"Shuang Yang, Xi Yang, Tianchen Lyu, Xing He, Dejana Braithwaite, Hiren J Mehta, Yi Guo, Yonghui Wu, Jiang Bian","doi":"10.1109/ichi54592.2022.00125","DOIUrl":"10.1109/ichi54592.2022.00125","url":null,"abstract":"<p><p>This study aims to develop a natural language processing (NLP) tool to extract the pulmonary nodules and nodule characteristics information from free-text clinical narratives. We identified a cohort of 3,080 patients who received low dose computed tomography (LDCT) at the University of Florida health system and collected their clinical narratives including radiology reports in their electronic health records (EHRs). Then, we manually annotated 394 reports as the gold-standard corpus and explored three state-of-the-art transformer-based NLP methods. The best model achieved an F1-score of 0.9279.</p>","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9511964/pdf/nihms-1836669.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9655481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speech Recognition Technologies Based on Artificial Intelligence Algorithms","authors":"M. Musaev, I. Khujayarov, M. Ochilov","doi":"10.1007/978-3-031-27199-1_6","DOIUrl":"https://doi.org/10.1007/978-3-031-27199-1_6","url":null,"abstract":"","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77351466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Co-creating Computer Supported Collective Intelligence in Citizen Science Hubs","authors":"Aelita Skaržauskienė, M. Maciuliene","doi":"10.1007/978-3-031-27199-1_43","DOIUrl":"https://doi.org/10.1007/978-3-031-27199-1_43","url":null,"abstract":"","PeriodicalId":73284,"journal":{"name":"IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79758404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}