{"title":"A New Metric for Measuring Locational Health Access for Cancer Treatment.","authors":"Subhajit Chakrabarty, Udaysinh Rathod, Sweta Singh, Debarshi Roy, Ismael Maya","doi":"10.1109/BIBM62325.2024.10822220","DOIUrl":"10.1109/BIBM62325.2024.10822220","url":null,"abstract":"<p><p>Ensuring access to cancer treatment facilities is essential for delivering timely care, yet various barriers such as geographic distance, socioeconomic factors, and social disparities can impede access in rural and urban regions. This study measured locational health access for colorectal cancer in the context of hospitals and population distribution in Louisiana. It used data on census tracts, hospital beds, and providers from the National Cancer Institute. By mapping the distribution of these healthcare facilities, the study revealed the potential of identifying significant challenges in accessing specialized cancer care. There is no existing locational health access metric in this domain. The contribution of this paper is that it meticulously calculated the actual road distance between each census tract centroid and each cancer-treating hospital, and offers a new locational health access metric. This metric considers the number of beds and number of oncologists as a proxy for measurement of cancer treatment facilities. The significance of this work is that it can be applied in a larger scope (such as the country), with more variables, and for other diseases treated by hospitals. It has public policy implications; hospitals can be located through such data-driven analysis.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. 
IEEE International Conference on Bioinformatics and Biomedicine","volume":"2024 ","pages":"6582-6588"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12241303/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144610560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting HIV Diagnosis Among Emerging Adults Using Electronic Health Records and Health Survey Data in All of Us Research Program.","authors":"Balu Bhasuran, Yiyang Liu, Mattia Prosperi, Karen MacDonell, Sylvie Naar, Zhe He","doi":"10.1109/bibm62325.2024.10822296","DOIUrl":"10.1109/bibm62325.2024.10822296","url":null,"abstract":"<p><p>The global decline in HIV incidence has not been mirrored in the United States, where young adults (ages 18-29) continue to account for a significant portion of new infections. In this study, we leverage the All of Us (AoU) Research Program's extensive electronic health records (EHRs) and health survey data to develop machine learning models capable of predicting HIV diagnoses at least three months before clinical identification. Among various models tested, the Support Vector Machine (SVM) model demonstrated a balanced performance, integrating clinically relevant features with robust predictive accuracy (AUC = 0.91). Risky drinking behaviors emerged as consistent top predictors across models, highlighting the importance of targeted interventions in this age group. Our findings underscore the potential of predictive analytics in enhancing HIV prevention strategies and informing public health efforts aimed at reducing HIV transmission among emerging adults.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. 
IEEE International Conference on Bioinformatics and Biomedicine","volume":"2024 ","pages":"5433-5440"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823436/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143415967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causal Explanation from Mild Cognitive Impairment Progression using Graph Neural Networks.","authors":"Arman Behnam, Muskan Garg, Xingyi Liu, Maria Vassilaki, Jennifer St Sauver, Ronald C Petersen, Sunghwan Sohn","doi":"10.1109/bibm62325.2024.10822848","DOIUrl":"10.1109/bibm62325.2024.10822848","url":null,"abstract":"<p><p>Mild Cognitive Impairment (MCI) is a transitional stage between normal cognitive aging and dementia. Some individuals with MCI revert to normal, while others progress to dementia. There are limited studies using explainable artificial intelligence on longitudinal data, particularly including genotypes, biomarkers and chronic diseases, to explore these differences. This study introduces a novel approach to understanding MCI progression using explainable graph neural networks. Utilizing longitudinal temporal data, we constructed a comprehensive graph representation of each individual in the study cohort. Our temporal graph convolutional network achieved 72.4% accuracy in predicting MCI transitions, while our causal explanation method outperformed existing explanation techniques in stability, accuracy, and faithfulness. We identified a causal subgraph with informative variables including hypertension, arrhythmia, congestive heart failure, coronary artery disease, stroke, lipid-related issues, and sex.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. 
IEEE International Conference on Bioinformatics and Biomedicine","volume":"2024 ","pages":"6349-6355"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11803575/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143384106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interpreting Lung Cancer Health Disparity between African American Males and European American Males.","authors":"Masrur Sobhan, Md Mezbahul Islam, Ananda Mohan Mondal","doi":"10.1109/bibm62325.2024.10822014","DOIUrl":"10.1109/bibm62325.2024.10822014","url":null,"abstract":"<p><p>Lung cancer remains a predominant cause of cancer-related deaths, with notable disparities in incidence and outcomes across racial and gender groups. This study addresses these disparities by developing a computational framework leveraging explainable artificial intelligence (XAI) to identify both patient- and cohort-specific biomarker genes in lung cancer. Specifically, we focus on two lung cancer subtypes, Lung Adenocarcinoma (LUAD) and Lung Squamous Cell Carcinoma (LUSC), examining distinct racial and sex-specific cohorts: African American males (AAMs) and European American males (EAMs). This study innovatively structures classification tasks based on disease conditions rather than racial labels to avoid race-specific imbalance. We constructed four classification tasks, one three-class problem (LUAD-LUSC-HEALTHY) and three two-class problems (LUAD-LUSC, LUAD-HEALTHY, LUSC-HEALTHY), to interpret the disease behavior of the patients in terms of genes and pathways. This methodology allows a LUAD or LUSC patient to be analyzed via multiple classifications, yielding robust disparity information for every patient. This preliminary work reports the disparity information for LUAD only. Utilizing transcriptome data from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) projects, we processed samples for LUAD, LUSC, and HEALTHY cohorts. We applied machine learning models, including convolutional neural network (CNN), logistic regression (LR), naïve Bayesian classifier (NB), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) for the classification. 
The SHapley Additive exPlanation (SHAP)-based interpretation of the best performing classification model uncovered cohort-specific genes and pathways related to health disparities between LUAD-AAM and LUAD-EAM cohorts.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. IEEE International Conference on Bioinformatics and Biomedicine","volume":"2024 ","pages":"7141-7143"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11753458/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143026044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Benchmarking Distance Functions in Siamese Networks for Current and Prior Mammogram Image Analysis.","authors":"Sahand Hamzehei, Afsana Ahsan Jeny, Annie Jin, Clifford Yang, Sheida Nabavi","doi":"10.1109/bibm62325.2024.10822291","DOIUrl":"10.1109/bibm62325.2024.10822291","url":null,"abstract":"<p><p>Mammogram image analysis has benefited from advancements in artificial intelligence (AI), particularly through the use of Siamese networks, which, similar to radiologists, compare current and prior mammogram images to enhance diagnostic accuracy. One of the main challenges in employing Siamese networks for this purpose is selecting an effective distance function. Given the complexity of mammogram images and the high correlation between current and prior images, traditional distance functions in Siamese networks often fall short in capturing the subtle, non-linear differences between these correlated features. This study explores the impact of incorporating non-linear and correlation-sensitive distance functions within a Siamese network framework for analyzing paired mammogram images. We benchmarked different distance functions, including Euclidean, Manhattan, Mahalanobis, Radial Basis Function (RBF), and cosine, and introduced a novel combination of RBF with Matern Covariance. Our evaluation revealed that the RBF with Matern Covariance consistently outperformed other functions, emphasizing the importance of addressing non-linearity and correlation in this context. For instance, the ResNet50 model, when paired with this distance function, achieved an accuracy of 0.938, sensitivity of 0.921, precision of 0.955, specificity of 0.958, F1 score of 0.930, and AUC of 0.940. We observed similarly strong performance across other models as well. Furthermore, the robustness of our approach was confirmed through evaluation on a dataset of 30 cross-validation samples, demonstrating its generalizability. 
These findings underscore the effectiveness of non-linear and correlation-based distance functions in Siamese networks for improving the performance and generalization of mammogram image analysis. All code used in this paper is available at https://github.com/NabaviLab/Benchmarking_Distance_Functions_in_Siamese_Networks.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. IEEE International Conference on Bioinformatics and Biomedicine","volume":"2024 ","pages":"1996-2003"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12250141/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144628015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context-Aware Contrastive Representation Learning for Zero-Shot Biomedical Text Classification.","authors":"Ratri Mukherjee, Kishlay Jha","doi":"10.1109/bibm62325.2024.10822585","DOIUrl":"10.1109/bibm62325.2024.10822585","url":null,"abstract":"<p><p>Biomedical text classification refers to the task of annotating a biomedical text with its relevant labels from a candidate label set. Most existing approaches operate in a fully supervised setting and thus rely heavily on human-annotated training data, which is both labor-intensive and monetarily expensive. To address this, we propose to formulate biomedical text classification under the zero-shot learning (ZSL) paradigm that does not require any labeled training data and only relies on label surface names for training and inference. Specifically, we propose a new context-aware contrastive learning technique for ZSL that fully exploits the context information present in the biomedical text to generate semantically enriched feature representations needed for accurate zero-shot biomedical text classification. Unlike existing contrastive learning approaches that typically employ random text segmentation strategies to generate contrastive pairs, our approach utilizes the context information inherently present in biomedical text to generate semantically meaningful contrastive pairs. Extensive experiments on the largest available biomedical corpus validate the effectiveness of the proposed approach.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. 
IEEE International Conference on Bioinformatics and Biomedicine","volume":"2024 ","pages":"3611-3614"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11916847/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143659972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Addressing Class Imbalance with Latent Diffusion-based Data Augmentation for Improving Disease Classification in Pediatric Chest X-rays.","authors":"Sivaramakrishnan Rajaraman, Zhaohui Liang, Zhiyun Xue, Sameer Antani","doi":"10.1109/bibm62325.2024.10822172","DOIUrl":"10.1109/bibm62325.2024.10822172","url":null,"abstract":"<p><p>Deep learning (DL) has transformed medical image classification; however, its efficacy is often limited by significant data imbalance due to far fewer cases (minority class) compared to controls (majority class). It has been shown that synthetic image augmentation techniques can simulate clinical variability, leading to enhanced model performance. We hypothesize that they could also mitigate the challenge of data imbalance, thereby addressing overfitting to the majority class and enhancing generalization. Recently, latent diffusion models (LDMs) have shown promise in synthesizing high-quality medical images. This study evaluates the effectiveness of a text-guided image-to-image LDM in synthesizing disease-positive chest X-rays (CXRs) and augmenting a pediatric CXR dataset to improve classification performance. We first establish baseline performance by fine-tuning an ImageNet-pretrained Inception-V3 model on class-imbalanced data for two tasks: normal vs. pneumonia and normal vs. bronchopneumonia. Next, we fine-tune individual text-guided image-to-image LDMs to generate CXRs showing signs of pneumonia and bronchopneumonia. The Inception-V3 model is retrained on an updated data set that includes these synthesized images as part of augmented training and validation sets. Classification performance is compared using balanced accuracy, sensitivity, specificity, F-score, Matthews correlation coefficient (MCC), Kappa, and Youden's index against the baseline performance. 
Results show that the augmentation significantly improves Youden's index (p<0.05) and markedly enhances other metrics, indicating that data augmentation using LDM-synthesized images is an effective strategy for addressing class imbalance in medical image classification.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. IEEE International Conference on Bioinformatics and Biomedicine","volume":"2024 ","pages":"5059-5066"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11936509/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143712499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parsing Clinical Trial Eligibility Criteria for Cohort Query by a Multi-Input Multi-Output Sequence Labeling Model.","authors":"Shubo Tian, Pengfei Yin, Hansi Zhang, Arslan Erdengasileng, Jiang Bian, Zhe He","doi":"10.1109/bibm58861.2023.10385876","DOIUrl":"10.1109/bibm58861.2023.10385876","url":null,"abstract":"<p><p>To enable electronic screening of eligible patients for clinical trials, free-text clinical trial eligibility criteria should be translated to a computable format. Natural language processing (NLP) techniques have the potential to automate this process. In this study, we explored a supervised multi-input multi-output (MIMO) sequence labelling model to parse eligibility criteria into combinations of fact and condition tuples. Our experiments on a small manually annotated training dataset showed that the MIMO framework with a BERT-based encoder using all the input sequences achieved an overall lenient-level AUROC of 0.61. Although the performance is suboptimal, representing eligibility criteria into logical and semantically clear tuples can potentially make subsequent translation of these tuples into database queries more reliable.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. IEEE International Conference on Bioinformatics and Biomedicine","volume":"2023 ","pages":"4426-4430"},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11251129/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141629519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Practical Approach to Disease Risk Prediction: Focus on High-Risk Patients via Highest-<i>k</i> Loss.","authors":"Hongyi Yang, Rich Gonzalez, Brahmajee K Nallamothu, Keith D Aaronson, Kevin R Ward, Alfred O Hero, Sardar Ansari","doi":"10.1109/bibm58861.2023.10385816","DOIUrl":"10.1109/bibm58861.2023.10385816","url":null,"abstract":"<p><p>Disease risk prediction models play an important role in preventing disease developments in modern healthcare. However, the lack of focus on high-risk patients has hindered the large-scale practical application of these models, especially considering the limitation of medical resources available for following up on patients who are deemed high-risk. In this study, we propose a novel and practical approach that focuses on minimizing the number of false positive observations among high-risk patients by introducing the <i>Highest</i>-<i>k Loss</i>. The solution is to estimate the weights of the highest <math><mi>k</mi></math> scores with a differentiable estimation of the sorting operation and apply the weights to the loss function. We extracted 253,680 survey responses from a public dataset of the U.S. health survey system to define a diabetes prediction task. This study employs nested cross-validation as well as an aggregated model applied to an independent test set to systematically evaluate the proposed method. Compared with traditional binary cross entropy loss and Focal loss, the Highest- <math><mi>k</mi></math> loss improved the precision (positive predictive value) for the highest 1% scores by 0.05 (95% CI: 0.041-0.055), the highest 5% scores by 0.03 (95% CI: 0.024-0.032), and the highest 10% scores by 0.02 (95% CI: 0.016-0.021). 
The introduced Highest- <math><mi>k</mi></math> loss function addresses the problem of prevailing risk prediction models and offers a practical solution that focuses on patients with the <math><mi>k</mi></math> highest predictive scores who can realistically receive an intervention as opposed to the entire patient population.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. IEEE International Conference on Bioinformatics and Biomedicine","volume":"2023 ","pages":"3226-3233"},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11821551/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143415935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building Prediction Models for 30-Day Readmissions Among ICU Patients Using Both Structured and Unstructured Data in Electronic Health Records.","authors":"Alex Moerschbacher, Zhe He","doi":"10.1109/bibm58861.2023.10385612","DOIUrl":"10.1109/bibm58861.2023.10385612","url":null,"abstract":"<p><p>ICU readmissions are associated with poor outcomes for patients and poor performance of hospitals. Patients who are readmitted have an increased risk of in-hospital deaths; hospitals with a higher readmission rate have reduced profitability, due to an increase in cost and reduced payments from Medicare and Medicaid programs. Predicting a patient's likelihood of being readmitted to the ICU can help reduce early discharges and the risk of in-hospital deaths, and help increase profitability. In this study, we built and evaluated multiple machine learning models to predict 30-day readmission rates of ICU patients in the MIMIC-III database. We used both the structured data, including demographics, laboratory tests, and comorbidities, and unstructured discharge summaries as the predictors and evaluated different combinations of features. The best-performing model in this study, Logistic Regression, achieved an AUROC of 75.7%. This study shows the potential of leveraging machine learning and deep learning for predicting ICU readmissions.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. 
IEEE International Conference on Bioinformatics and Biomedicine","volume":"2023 ","pages":"4368-4373"},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11271049/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141763104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}