Jinge Wu, Hang Dong, Zexi Li, Haowei Wang, Runci Li, Arijit Patra, Chengliang Dai, Waqar Ali, Phil Scordis, Honghan Wu
{"title":"A hybrid framework with large language models for rare disease phenotyping.","authors":"Jinge Wu, Hang Dong, Zexi Li, Haowei Wang, Runci Li, Arijit Patra, Chengliang Dai, Waqar Ali, Phil Scordis, Honghan Wu","doi":"10.1186/s12911-024-02698-7","DOIUrl":"https://doi.org/10.1186/s12911-024-02698-7","url":null,"abstract":"<p><strong>Purpose: </strong>Rare diseases pose significant challenges in diagnosis and treatment due to their low prevalence and heterogeneous clinical presentations. Unstructured clinical notes contain valuable information for identifying rare diseases, but manual curation is time-consuming and prone to subjectivity. This study aims to develop a hybrid approach combining dictionary-based natural language processing (NLP) tools with large language models (LLMs) to improve rare disease identification from unstructured clinical reports.</p><p><strong>Methods: </strong>We propose a novel hybrid framework that integrates the Orphanet Rare Disease Ontology (ORDO) and the Unified Medical Language System (UMLS) to create a comprehensive rare disease vocabulary. SemEHR, a dictionary-based NLP tool, is employed to extract rare disease mentions from clinical notes. To refine the results and improve accuracy, we leverage various LLMs, including LLaMA3, Phi3-mini, and domain-specific models like OpenBioLLM and BioMistral. Different prompting strategies, such as zero-shot, few-shot, and knowledge-augmented generation, are explored to optimize the LLMs' performance.</p><p><strong>Results: </strong>The proposed hybrid approach demonstrates superior performance compared to traditional NLP systems and standalone LLMs. LLaMA3 and Phi3-mini achieve the highest F1 scores in rare disease identification. Few-shot prompting with 1-3 examples yields the best results, while knowledge-augmented generation shows limited improvement. 
Notably, the approach uncovers a significant number of potential rare disease cases not documented in structured diagnostic records, highlighting its ability to identify previously unrecognized patients.</p><p><strong>Conclusion: </strong>The hybrid approach combining dictionary-based NLP tools with LLMs shows great promise for improving rare disease identification from unstructured clinical reports. By leveraging the strengths of both techniques, the method demonstrates superior performance and the potential to uncover hidden rare disease cases. Further research is needed to address limitations related to ontology mapping and overlapping case identification, and to integrate the approach into clinical practice for early diagnosis and improved patient outcomes.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11460004/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142388235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Designing and evaluating a mobile app to assist patients undergoing coronary angiography and assessing its impact on anxiety, stress levels, and self-care.","authors":"Milad Safaei, Amin Mahdavi, Roghayeh Mehdipour-Rabori","doi":"10.1186/s12911-024-02703-z","DOIUrl":"10.1186/s12911-024-02703-z","url":null,"abstract":"<p><strong>Background: </strong>Coronary artery disease is one of the leading causes of death and disability worldwide. Coronary angiography is a diagnostic procedure used to detect atherosclerosis. Patients typically experience anxiety and stress before and during the angiography procedure. Furthermore, self-care ability is crucial following angiography.</p><p><strong>Aim: </strong>This study aims to describe the design and evaluation of a mobile application focusing on stress, anxiety, and self-care abilities in patients undergoing coronary angiography.</p><p><strong>Method: </strong>The researchers developed a mobile application for patients undergoing angiography. The application provides information about angiography and tips for enhancing self-care following the procedure. An interventional study was conducted on 70 patients admitted to the angiography ward in hospitals in Kerman, Iran, between 2022 and 2023. The participants were randomly divided into two groups: control and intervention. The interventional group received the intervention application the night before angiography. Two groups completed the Anxiety and Stress Questionnaire (DAS) and Kearney-Flescher Self-Care Survey before the intervention. The researchers used questionnaires that had been prepared and previously utilized in other studies. The two groups completed the anxiety and stress questionnaire within three to six hours and the self-care questionnaire one month after angiography. SPSS 15 software was used for data analysis, with a significance level set at 0.05.</p><p><strong>Results: </strong>The study found that the majority of participants were women. 
Before the study, there was no significant difference between the two groups in terms of anxiety, stress, and self-care scores. However, after the study, the intervention group showed a significant decrease in average anxiety and stress scores (p < 0.001). Additionally, compared to the control group, the intervention group demonstrated significant improvement in average self-care score (p < 0.001).</p><p><strong>Conclusion: </strong>According to this study, the application can be effective in influencing the anxiety, stress levels, and self-care ability of patients who undergo coronary angiography. It can help to reduce stress and anxiety while increasing self-care. Such instructional software is user-friendly and cost-effective, and can be recommended by nurses and doctors.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11460224/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142388237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ali Vafaee-Najar, Elaheh Hooshmand, Arefeh Pourtaleb, Hasan Ramezani Chenar
{"title":"New experience of implementing patient e-referral in the Iranian health system: a qualitative study.","authors":"Ali Vafaee-Najar, Elaheh Hooshmand, Arefeh Pourtaleb, Hasan Ramezani Chenar","doi":"10.1186/s12911-024-02706-w","DOIUrl":"10.1186/s12911-024-02706-w","url":null,"abstract":"<p><strong>Background: </strong>Implementing an electronic system of service categorization and a referral system in healthcare is a strategic approach to improving overall health outcomes and optimizing resource use. This study aimed to investigate challenges experienced with the electronic patient referral system in Mashhad University of Medical Sciences (MUMS).</p><p><strong>Methods: </strong>In this qualitative research, data were collected using semi-structured interviews. Participants included physicians, experts, and stakeholders working in the Family Physician Program and the referral system, selected through purposive sampling. The data were analyzed using a thematic analysis framework, in which a thematic framework was developed, and key themes were identified. Data analysis was performed using Atlas.ti8 software.</p><p><strong>Results: </strong>According to the interviewees, the challenges of digitizing the referral system can be categorized into three main themes: structure, process, and outcomes. These themes include ten sub-themes, such as challenges related to Internet Infrastructure and the Sina System, Patients' Choice of Desired Specialists, Receiving Payment for Services, Appointment Scheduling, Interdepartmental Coordination, Recording Definitive Diagnosis Codes Before Referral, False Referrals, Dissatisfaction, Feedbacks, and Health Indicators.</p><p><strong>Conclusion: </strong>To improve the e-referral in Iran's health system, several strategies can be implemented. These include sustainable resource allocation, designing consequence mechanisms within the referral system to motivate collaboration and improving appointment scheduling systems. 
Furthermore, addressing these challenges requires a collaborative approach involving healthcare providers, IT professionals, and patient representatives to ensure that the system is efficient, user-friendly, and effectively meets the needs of all parties involved. Not paying enough attention to these issues causes reform failure, while solving them requires multi-dimensional, systematic, and coordinated interventions with a deep understanding of the obstacles and challenges. Disregarding these factors may result in apathy over time, ultimately impacting both the quantity and, more importantly, the quality of services.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11459699/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142388241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development and validation of an instrument to evaluate the perspective of using the electronic health record in a hospital setting.","authors":"Radouane Rhayha, Abderrahman Alaoui Ismaili","doi":"10.1186/s12911-024-02675-0","DOIUrl":"10.1186/s12911-024-02675-0","url":null,"abstract":"<p><strong>Background: </strong>Evaluating healthcare information systems, such as the Electronic Health Records (EHR), is both challenging and essential, especially in resource-limited countries. This study aims to psychometrically develop and validate an instrument (questionnaire) to assess the factors influencing the successful adoption of the EHR system by healthcare professionals in Moroccan university hospitals.</p><p><strong>Methods: </strong>The questionnaire validation process occurred in two main stages. Initially, data collected from a pilot sample of 164 participants underwent analysis using exploratory factor analysis (EFA) to evaluate the validity and reliability of the retained factor structure. Subsequently, the validity of the overall measurement model was confirmed using confirmatory factor analysis (CFA) in a sample of 368 healthcare professionals.</p><p><strong>Results: </strong>The structure of the modified HOT-fit model, comprising seven constructs (System Quality, Information Quality, Information technology Service Quality, User Satisfaction, Organization, Environment, and Clinical Performance), was confirmed through confirmatory factor analysis. Absolute, incremental, and parsimonious fit indices all indicated an appropriate level of acceptability, affirming the robustness of the measurement model. Additionally, the instrument demonstrated adequate reliability and convergent validity, with composite reliability values ranging from 0.75 to 0.89 and average variance extracted (AVE) values ranging from 0.51 to 0.63. 
Furthermore, the square roots of AVE values exceeded the correlations between different pairs of constructs, and the heterotrait-monotrait ratio of correlations (HTMT) was below 0.85, confirming suitable discriminant validity.</p><p><strong>Conclusions: </strong>The resulting instrument, due to its rigorous development and validation process, can serve as a reliable and valid tool for assessing the success of information technologies in similar contexts.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11460146/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142388238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
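The discriminant-validity check described in the abstract above (the square root of each construct's AVE exceeding its correlations with other constructs, commonly called the Fornell-Larcker criterion) can be sketched in a few lines of Python. This is only an illustration: the function name `fornell_larcker_ok`, the choice of two constructs, and the correlation values are assumptions, and only the AVE range of 0.51 to 0.63 comes from the abstract.

```python
import math


def fornell_larcker_ok(ave, corr):
    """Return True if, for every construct pair, |correlation| is below
    the square root of the AVE of both constructs in the pair."""
    for (a, b), r in corr.items():
        if abs(r) >= min(math.sqrt(ave[a]), math.sqrt(ave[b])):
            return False
    return True


# AVE values at the two ends of the range reported in the abstract;
# the assignment to these particular constructs is hypothetical.
ave = {"System Quality": 0.51, "User Satisfaction": 0.63}

# Hypothetical inter-construct correlations, for illustration only.
passing = {("System Quality", "User Satisfaction"): 0.60}
failing = {("System Quality", "User Satisfaction"): 0.75}

print(fornell_larcker_ok(ave, passing))  # sqrt(0.51) ~ 0.714 > 0.60
print(fornell_larcker_ok(ave, failing))  # 0.75 > sqrt(0.51) ~ 0.714
```

With an AVE of 0.51, a construct tolerates correlations up to roughly 0.71 before this criterion fails, which is why the check is stricter for constructs at the low end of the reported range.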
Neel Kanwal, Farbod Khoraminia, Umay Kiraz, Andrés Mosquera-Zamudio, Carlos Monteagudo, Emiel A M Janssen, Tahlita C M Zuiverloon, Chunming Rong, Kjersti Engan
{"title":"Equipping computational pathology systems with artifact processing pipelines: a showcase for computation and performance trade-offs.","authors":"Neel Kanwal, Farbod Khoraminia, Umay Kiraz, Andrés Mosquera-Zamudio, Carlos Monteagudo, Emiel A M Janssen, Tahlita C M Zuiverloon, Chunming Rong, Kjersti Engan","doi":"10.1186/s12911-024-02676-z","DOIUrl":"https://doi.org/10.1186/s12911-024-02676-z","url":null,"abstract":"<p><strong>Background: </strong>Histopathology is a gold standard for cancer diagnosis. It involves extracting tissue specimens from suspicious areas to prepare a glass slide for a microscopic examination. However, histological tissue processing procedures result in the introduction of artifacts, which are ultimately transferred to the digitized version of glass slides, known as whole slide images (WSIs). Artifacts are diagnostically irrelevant areas and may result in wrong predictions from deep learning (DL) algorithms. Therefore, detecting and excluding artifacts in the computational pathology (CPATH) system is essential for reliable automated diagnosis.</p><p><strong>Methods: </strong>In this paper, we propose a mixture of experts (MoE) scheme for detecting five notable artifacts, including damaged tissue, blur, folded tissue, air bubbles, and histologically irrelevant blood from WSIs. First, we train independent binary DL models as experts to capture particular artifact morphology. Then, we ensemble their predictions using a fusion mechanism. We apply probabilistic thresholding over the final probability distribution to improve the sensitivity of the MoE. We developed four DL pipelines to evaluate computational and performance trade-offs. These include two MoEs and two multiclass models of state-of-the-art deep convolutional neural networks (DCNNs) and vision transformers (ViTs). 
These DL pipelines are quantitatively and qualitatively evaluated on external and out-of-distribution (OoD) data to assess generalizability and robustness for the artifact detection application.</p><p><strong>Results: </strong>We extensively evaluated the proposed MoE and multiclass models. DCNN-based MoE and ViT-based MoE schemes outperformed simpler multiclass models and were tested on datasets from different hospitals and cancer types, where MoE using (MobileNet) DCNNs yielded the best results. The proposed MoE yields 86.15% F1 and 97.93% sensitivity scores on unseen data, at a lower inference cost than the MoE using ViTs. This best performance of the MoEs comes at a relatively higher computational cost than the multiclass models. Furthermore, we apply post-processing to create an artifact segmentation mask, a potential artifact-free RoI map, a quality report, and an artifact-refined WSI for further computational analysis. During the qualitative evaluation, field experts assessed the predictive performance of MoEs over OoD WSIs. They rated artifact detection and artifact-free area preservation, where the highest agreement translated to a Cohen's kappa of 0.82, indicating substantial agreement for the overall diagnostic usability of the DCNN-based MoE scheme.</p><p><strong>Conclusions: </strong>The proposed artifact detection pipeline will not only ensure reliable CPATH predictions but may also provide quality control. In this work, the best-performing pipeline for artifact detection is MoE with DCNNs. 
Our detailed experiments show that there is always","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11457387/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142388239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M M Enes Yurtsever, Yilmaz Atay, Bilgehan Arslan, Seref Sagiroglu
{"title":"Development of brain tumor radiogenomic classification using GAN-based augmentation of MRI slices in the newly released gazi brains dataset.","authors":"M M Enes Yurtsever, Yilmaz Atay, Bilgehan Arslan, Seref Sagiroglu","doi":"10.1186/s12911-024-02699-6","DOIUrl":"10.1186/s12911-024-02699-6","url":null,"abstract":"<p>Technological advances have recently driven significant progress in studies on brain cancer. In this context, identifying and correctly classifying tumors is a crucial task in the field of medical imaging. The disease-related tumor classification problem, which has also become a focus of deep learning technologies, is very important in the diagnosis and treatment of the disease. The use of deep learning models has shown promising results in recent years. However, the sparsity of ground truth data in medical imaging or inconsistent data sources poses a significant challenge for training these models. The utilization of StyleGANv2-ADA is proposed in this paper for augmenting brain MRI slices to enhance the performance of deep learning models. Specifically, augmentation is applied solely to the training data to prevent any potential leakage. The StyleGANv2-ADA model is trained with the Gazi Brains 2020, BraTS 2021, and Br35h datasets using the researchers' default settings. The effectiveness of the proposed method is demonstrated on datasets for brain tumor classification, resulting in a notable improvement in the overall accuracy of the model for brain tumor classification on all the Gazi Brains 2020, BraTS 2021, and Br35h datasets. Importantly, the utilization of StyleGANv2-ADA on the Gazi Brains 2020 Dataset represents a novel experiment in the literature. The results show that the augmentation with StyleGAN can help overcome the challenges of working with medical data and the sparsity of ground truth data. 
Data augmentation employing the StyleGANv2-ADA GAN model yielded the highest overall accuracy for brain tumor classification on the BraTS 2021 and Gazi Brains 2020 datasets, together with the Br35h dataset, achieving 75.18%, 99.36%, and 98.99% on the EfficientNetV2S models, respectively. This study emphasizes the potency of GANs for augmenting medical imaging datasets, particularly in brain tumor classification, showcasing a notable increase in overall accuracy through the integration of synthetic GAN data on the datasets used.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11450983/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142375111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wantao Zhang, Yan Zhu, Liqun Tong, Guo Wei, Huajun Zhang
{"title":"Leverage machine learning to identify key measures in hospital operations management: a retrospective study to explore feasibility and performance of four common algorithms.","authors":"Wantao Zhang, Yan Zhu, Liqun Tong, Guo Wei, Huajun Zhang","doi":"10.1186/s12911-024-02689-8","DOIUrl":"10.1186/s12911-024-02689-8","url":null,"abstract":"<p><strong>Background: </strong>Measures in operations management are pivotal for monitoring and assessing various aspects of hospital performance. Existing literature highlights the importance of regularly updating key management measures to reflect changing trends and organizational goals. Advancements in machine learning (ML) have presented promising opportunities for enhancing the process of updating operations management measures. However, their specific application and performance remain relatively unexplored. We aimed to investigate the feasibility and effectiveness of using common ML techniques to identify and update key measures in hospital operations management.</p><p><strong>Methods: </strong>Historical data on 43 measures on financial balance and quality of care under 4 categories were retrieved from the BI system of a regional health system in Central China. The dataset included 17 surgical and 15 non-surgical departments over 48 months. Four common ML techniques, linear models (LM), random forest (RF), partial least squares (PLS), and neural networks (NN), were used to identify the most important measures. Ordinary least squares regression was employed to investigate the impact of the top 10 measures. A ground truth validation compared the ML-identified key measures against the humanly decided strategic measures from annual meeting minutes.</p><p><strong>Results: </strong>For financial balancing, inpatient treatment revenue was an important measure in 3/4 years, followed by equipment depreciation costs. 
The measures identified using the same technique differed between years, though RF and PLS yielded relatively consistent results. For quality of care, none of the ML-identified measures repeated over the years. Those consistently important over four years differed almost entirely among the four techniques. On ground truth validation, the 2016-2019 ML-identified measures were among the humanly identified measures, with the exception of equipment depreciation from the 2019 dataset. All the ML-identified measures for quality of care failed to coincide with the humanly decided measures.</p><p><strong>Conclusions: </strong>Using ML to identify key hospital operational measures is viable, but the performance of ML techniques varies considerably. RF performs best among the four techniques in identifying key measures in financial balance. None of the ML techniques seem effective for identifying quality of care measures. ML is suggested as a decision support tool to remind and inspire decision-makers in certain aspects of hospital operations management.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11451234/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142375112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning-based prediction model for hypofibrinogenemia after tigecycline therapy.","authors":"Jianping Zhu, Rui Zhao, Zhenwei Yu, Liucheng Li, Jiayue Wei, Yan Guan","doi":"10.1186/s12911-024-02694-x","DOIUrl":"10.1186/s12911-024-02694-x","url":null,"abstract":"<p><strong>Background: </strong>In clinical practice, the incidence of hypofibrinogenemia (HF) after tigecycline (TGC) treatment significantly exceeds the probability claimed by drug manufacturers.</p><p><strong>Objective: </strong>We aimed to identify the risk factors for TGC-associated HF and develop prediction and survival models for TGC-associated HF and the timing of TGC-associated HF.</p><p><strong>Methods: </strong>This single-center retrospective cohort study included 222 patients who were prescribed TGC. First, we used binary logistic regression to screen the independent factors influencing TGC-associated HF, which were used as predictors to train the extreme gradient boosting (XGBoost) model. Receiver operating characteristic curve (ROC), calibration curve, decision curve analysis (DCA), and clinical impact curve analysis (CICA) were used to evaluate the performance of the model in the verification cohort. Subsequently, we conducted survival analysis using the random survival forest (RSF) algorithm. A consistency index (C-index) was used to evaluate the accuracy of the RSF model in the verification cohort.</p><p><strong>Results: </strong>Binary logistic regression identified nine independent factors influencing TGC-associated HF, and the XGBoost model was constructed using these nine predictors. The ROC and calibration curves showed that the model had good discrimination (areas under the ROC curves (AUC) = 0.792 [95% confidence interval (CI), 0.668-0.915]) and calibration ability. In addition, DCA and CICA demonstrated good clinical practicability of this model. Notably, the RSF model showed good accuracy (C-index = 0.746 [95%CI, 0.652-0.820]) in the verification cohort. 
Stratifying patients treated with TGC based on the RSF model revealed a statistically significant difference in the mean survival time between the low- and high-risk groups.</p><p><strong>Conclusions: </strong>The XGBoost model effectively predicts the risk of TGC-associated HF, whereas the RSF model has advantages in risk stratification. These two models have significant clinical practical value, with the potential to reduce the risk of TGC therapy.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11451173/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142375113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ghada Mostafa, Hamdi Mahmoud, Tarek Abd El-Hafeez, Mohamed E ElAraby
{"title":"The power of deep learning in simplifying feature selection for hepatocellular carcinoma: a review.","authors":"Ghada Mostafa, Hamdi Mahmoud, Tarek Abd El-Hafeez, Mohamed E ElAraby","doi":"10.1186/s12911-024-02682-1","DOIUrl":"10.1186/s12911-024-02682-1","url":null,"abstract":"<p><strong>Background: </strong>Hepatocellular Carcinoma (HCC) is a highly aggressive, prevalent, and deadly type of liver cancer. With the advent of deep learning techniques, significant advancements have been made in simplifying and optimizing the feature selection process.</p><p><strong>Objective: </strong>Our scoping review presents an overview of the various deep learning models and algorithms utilized to address feature selection for HCC. The paper highlights the strengths and limitations of each approach, along with their potential applications in clinical practice. Additionally, it discusses the benefits of using deep learning to identify relevant features and their impact on the accuracy and efficiency of diagnosis, prognosis, and treatment of HCC.</p><p><strong>Design: </strong>The review encompasses a comprehensive analysis of the research conducted in the past few years, focusing on the methodologies, datasets, and evaluation metrics adopted by different studies. The paper aims to identify the key trends and advancements in the field, shedding light on the promising areas for future research and development.</p><p><strong>Results: </strong>The findings of this review indicate that deep learning techniques have shown promising results in simplifying feature selection for HCC. By leveraging large-scale datasets and advanced neural network architectures, these methods have demonstrated improved accuracy and robustness in identifying predictive features.</p><p><strong>Conclusions: </strong>We analyze published studies to reveal the state of the art in HCC prediction and showcase how deep learning can boost accuracy and decrease false positives. 
But we also acknowledge the challenges that remain in translating this potential into clinical reality.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11452940/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142375127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ken Cheligeer, Guosong Wu, Alison Laws, May Lynn Quan, Andrea Li, Anne-Marie Brisson, Jason Xie, Yuan Xu
{"title":"Validation of large language models for detecting pathologic complete response in breast cancer using population-based pathology reports.","authors":"Ken Cheligeer, Guosong Wu, Alison Laws, May Lynn Quan, Andrea Li, Anne-Marie Brisson, Jason Xie, Yuan Xu","doi":"10.1186/s12911-024-02677-y","DOIUrl":"10.1186/s12911-024-02677-y","url":null,"abstract":"<p><strong>Aims: </strong>The primary goal of this study is to evaluate the capabilities of Large Language Models (LLMs) in understanding and processing complex medical documentation. We chose to focus on the identification of pathologic complete response (pCR) in narrative pathology reports. This approach aims to contribute to the advancement of comprehensive reporting, health research, and public health surveillance, thereby enhancing patient care and breast cancer management strategies.</p><p><strong>Methods: </strong>The study utilized two analytical pipelines, developed with open-source LLMs within the healthcare system's computing environment. First, we extracted embeddings from pathology reports using 15 different transformer-based models and then employed logistic regression on these embeddings to classify the presence or absence of pCR. Secondly, we fine-tuned the Generative Pre-trained Transformer-2 (GPT-2) model by attaching a simple feed-forward neural network (FFNN) layer to improve the detection performance of pCR from pathology reports.</p><p><strong>Results: </strong>In a cohort of 351 female breast cancer patients who underwent neoadjuvant chemotherapy (NAC) and subsequent surgery between 2010 and 2017 in Calgary, the optimized method displayed a sensitivity of 95.3% (95%CI: 84.0-100.0%), a positive predictive value of 90.9% (95%CI: 76.5-100.0%), and an F1 score of 93.0% (95%CI: 83.7-100.0%). 
The results, achieved through diverse LLM integration, surpassed traditional machine learning models, underscoring the potential of LLMs in clinical pathology information extraction.</p><p><strong>Conclusions: </strong>The study successfully demonstrates the efficacy of LLMs in interpreting and processing digital pathology data, particularly for determining pCR in breast cancer patients post-NAC. The superior performance of LLM-based pipelines over traditional models highlights their significant potential in extracting and analyzing key clinical data from narrative reports. While promising, these findings highlight the need for future external validation to confirm the reliability and broader applicability of these methods.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11447988/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142370975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
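As a quick consistency check on the metrics quoted in the abstract above, the F1 score is the harmonic mean of positive predictive value (precision) and sensitivity (recall). The Python sketch below recomputes it from the two reported point estimates; the function name `f1_from_ppv_sensitivity` is illustrative and does not come from the paper.

```python
def f1_from_ppv_sensitivity(ppv, sensitivity):
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Point estimates quoted in the abstract, expressed as fractions.
reported_sensitivity = 0.953
reported_ppv = 0.909

f1 = f1_from_ppv_sensitivity(reported_ppv, reported_sensitivity)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.930"
```

The recomputed value agrees with the reported F1 score of 93.0%, so the three point estimates are mutually consistent.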