International Journal of Medical Informatics: Latest Articles

Machine learning in healthcare citizen science: A scoping review
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105766 Pub Date: 2024-12-19 DOI: 10.1016/j.ijmedinf.2024.105766
Ranga Baminiwatte, Blessing Torsu, Dmitry Scherbakov, Abolfazl Mollalo, Jihad S. Obeid, Alexander V. Alekseyenko, Leslie A. Lenert
Abstract
Objectives: This scoping review aims to clarify the definition and trajectory of citizen-led scientific research (so-called citizen science) within the healthcare domain, and to examine the degree of integration of machine learning (ML) and the participation levels of citizen scientists in health-related projects.
Materials and Methods: In January and September 2024 we conducted a comprehensive search in PubMed, Scopus, Web of Science, and the EBSCOhost platform for peer-reviewed publications that combine citizen science and machine learning (ML) in healthcare. Articles were excluded if citizens were merely passive data providers or if only professional scientists were involved.
Results: Of an initial 1,395 articles screened, 56 articles spanning 2013 to 2024 met the inclusion criteria. The majority of research projects were conducted in the U.S. (n = 20, 35.7%), followed by Germany (n = 6, 10.7%), with Spain, Canada, and the UK each contributing three studies (5.4%). Data collection was the primary form of citizen scientist involvement (n = 29, 51.8%), which included capturing images, sharing data online, and mailing samples. Data annotation was the next most common activity (n = 15, 26.8%), followed by participation in ML model challenges (n = 8, 14.3%) and decision-making contributions (n = 3, 5.4%). Mosquitoes (n = 10, 34.5%) and air pollution samples (n = 7, 24.2%) were the main data objects collected by citizens for ML analysis. Classification tasks were the most prevalent ML method (n = 30, 52.6%), with Convolutional Neural Networks being the most frequently used algorithm (n = 13, 20%).
Discussion and Conclusions: Citizen science in healthcare is currently an American and European construct with growing expansion in Asia. Citizens are contributing data and labeling data for ML methods, but only infrequently analyzing or leading studies. Projects that use “crowd-sourced” data and “citizen science” should be differentiated depending on the degree of involvement of citizens.
Citations: 0
Development and validation of machine learning models for predicting venous thromboembolism in colorectal cancer patients: A cohort study in China
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105770 Pub Date: 2024-12-19 DOI: 10.1016/j.ijmedinf.2024.105770
Zuhai Hu, Xiaosheng Li, Yuliang Yuan, Qianjie Xu, Wei Zhang, Haike Lei
Abstract
Background: With advancements in healthcare, traditional VTE risk assessment tools are increasingly insufficient to meet the demands of high-quality care, underscoring the need for innovative and specialized assessment methods.
Objective: Owing to the remarkable success of machine learning in supervised learning and disease prediction, our objective is to develop a reliable and efficient model for assessing VTE risk by leveraging the fundamental data and clinical characteristics of colorectal cancer patients within our medical facility.
Methods: Six commonly used machine learning algorithms were utilized in our study to predict the occurrence of VTE in patients with rectal cancer. In the modeling process, LASSO regression was employed to identify and exclude variables not associated with VTE. Additionally, hyperparameter tuning was conducted via 5-fold cross-validation to mitigate overfitting, and 200 bootstrap samples were used to adjust the apparent performance on the training set. The VTE assessment model was selected through a thorough evaluation of performance criteria such as the AUC, ACC, and F1 score.
Results: The RF model exhibits consistent and efficient performance. Specifically, in the internal validation dataset, where generalizability was adjusted, the RF model achieved the highest scores across multiple metrics: AD-AUC (0.895), AD-ACC (0.871), AD-F1 (0.311), AD-MCC (0.316), AD-Precision (0.241), AD-Specificity (0.888). For external validation on unseen colon cancer data, the RF model also performed best in terms of ACC (0.728), F1 (0.292), MCC (0.225), Precision (0.192), and Specificity (0.740), with a suboptimal AUC of 0.745 and a Sensitivity (Recall) of 0.615. Additionally, the RF model demonstrates strong performance not only on the original dataset but also on datasets processed via alternative imbalance handling techniques.
Conclusions: Our research established and validated a model for assessing the risk of VTE in colorectal cancer patients.
Citations: 0
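The metrics listed in this abstract (ACC, F1, MCC, Precision, Specificity, Sensitivity) can all be derived from a confusion matrix. The following is a minimal sketch of those standard formulas; the function name `binary_metrics` and the counts are hypothetical and are not taken from the study. It illustrates how, on imbalanced data, accuracy can be high while F1 and precision stay low.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical, illustrative counts for a rare positive class (40 of 1000 cases).
metrics = binary_metrics(tp=20, fp=80, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the classifier is right 90% of the time, yet only one in five positive predictions is correct, which is why reports on rare outcomes usually quote F1 and MCC alongside accuracy.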
Unsupervised tooth segmentation from three dimensional scans of the dental arch using domain adaptation of synthetic data
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105769 Pub Date: 2024-12-19 DOI: 10.1016/j.ijmedinf.2024.105769
Md Sahadul Hasan Arian, Faisal Ahmed Sifat, Saif Ahmed, Nabeel Mohammed, Taseef Hasan Farook
Abstract
Background: The automated segmentation of individual teeth from 3D models of the human dental arch is challenging due to variations in tooth alignment, arch form and overall maxillofacial anatomy. Domain adaptation is a specialised technique in deep learning which allows models to adapt to data from different domains, such as varying tooth and dental arch forms, without requiring human annotations.
Purpose: This study aimed to segment individual teeth from various dental arch morphologies in 3D intraoral scans using domain adaptation.
Materials and Methods: Twenty scanned dental arches from various age groups and developmental stages were used to generate 20 simplified synthetic variants of the scans. These synthetic variants, along with 16 natural scanned dental arches, were used to train the deep learning models. Domain adaptation was employed using Gradient Reversal Layer and Siamese Network techniques. The PointNet and PointNet++ model backbones were trained to align the latent space distribution of real and synthetic domains. Validations were performed on four unseen natural scanned arches, with and without domain adaptation enabled, to evaluate whether a 3D deep neural network can be trained without any human-annotated 3D models.
Results: On natural scanned dental arches, the PointNet and PointNet++ models achieved a mean intersection-over-union (mIoU) between 0.34 and 0.36 without domain adaptation, and 0.80 and 0.95, respectively, with domain adaptation enabled.
Conclusion: Domain adaptation techniques can enable training a segmentation deep learning model using synthetically generated 3D jaw scans without requiring human operators to annotate the training data.
Citations: 0
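The results above are reported as mean intersection-over-union (mIoU). As a minimal sketch of how that metric is commonly computed for per-point segmentation labels, the example below averages the IoU over a set of class labels. The function name `mean_iou`, the toy label sequences, and the choice to skip labels absent from both prediction and ground truth are assumptions; the study's exact evaluation protocol is not described in the abstract.

```python
def mean_iou(pred, target, labels):
    """Mean intersection-over-union across `labels` for two equal-length label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for label in labels:
        # Points where both sequences agree on this label.
        intersection = sum(1 for p, t in zip(pred, target) if p == label and t == label)
        # Points where either sequence carries this label.
        union = sum(1 for p, t in zip(pred, target) if p == label or t == label)
        if union:  # skip labels that appear in neither sequence
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical per-point labels (e.g. 0 = gingiva, 1 and 2 = two different teeth).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, labels=[0, 1, 2])
print(f"mIoU = {score:.2f}")
```

Here the per-label IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5; a perfect segmentation would score 1.0.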
OptimCLM: Optimizing clinical language models for predicting patient outcomes via knowledge distillation, pruning and quantization
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105764 Pub Date: 2024-12-18 DOI: 10.1016/j.ijmedinf.2024.105764
Mohammad Junayed Hasan, Fuad Rahman, Nabeel Mohammed
Abstract
Background: Clinical Language Models (CLMs) possess the potential to reform traditional healthcare systems by aiding in clinical decision making and optimal resource utilization. They can enhance patient outcomes and help healthcare management through predictive clinical tasks. However, their real-world deployment is limited due to high computational cost at inference, in terms of both time and space complexity.
Objective: This study aims to develop and optimize an efficient framework that compresses CLMs without significant performance loss, reducing inference time and disk space, and enabling real-world clinical applications.
Methods: We introduce OptimCLM, a framework for optimizing CLMs with ensemble learning, knowledge distillation (KD), pruning and quantization. Based on domain knowledge and performance, we select and combine domain-adaptive CLMs DischargeBERT and COReBERT as the teacher ensemble model. We transfer the teacher's knowledge to two smaller generalist models, BERT-PKD and TinyBERT, and apply black-box KD, post-training unstructured pruning and post-training 8-bit model quantization to them. In an admission-to-discharge setting, we evaluate the framework on four clinical outcome prediction tasks (length of stay prediction, mortality prediction, diagnosis prediction and procedure prediction) using admission notes from the MIMIC-III clinical database.
Results: The OptimCLM framework achieved up to 22.88× compression ratio and 28.7× inference speedup, with less than 5% and 2% loss in macro-averaged AUROC for TinyBERT and BERT-PKD, respectively. The teacher model outperformed five state-of-the-art models on all tasks. The optimized BERT-PKD model also outperformed them in most tasks.
Conclusion: Our findings suggest that domain-specific fine-tuning with ensemble learning and KD is more effective than domain-specific pre-training for domain-knowledge transfer and text classification tasks. Thus, this work demonstrates the feasibility and potential of deploying optimized CLMs in healthcare settings and developing them with fewer computational resources.
Citations: 0
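The abstract mentions post-training 8-bit quantization without giving the scheme. To illustrate the general idea only, here is a minimal sketch of symmetric per-tensor quantization, which maps floating-point weights to integers in [-127, 127] with a single scale factor. The scheme, the function names, and the toy weights are all assumptions for illustration and are not claimed to be what OptimCLM uses.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs else 1.0  # one scale shared by the whole tensor
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized integers back to approximate float values."""
    return [v * scale for v in quantized]


# Hypothetical weights; a real model would have millions of them per layer.
weights = [0.50, -1.27, 0.003, 1.00]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, f"scale={scale:.4f}", f"max_error={max_error:.4f}")
```

Each weight is stored in one byte instead of four, and the rounding error per weight is bounded by half the scale. Small weights such as 0.003 collapse to zero, which is one source of the accuracy loss that compression studies measure.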
A meta-analysis of AI and machine learning in project management: Optimizing vaccine development for emerging viral threats in biotechnology
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105768 Pub Date: 2024-12-18 DOI: 10.1016/j.ijmedinf.2024.105768
Jatin Vaghasiya, Mahim Khan, Tarak Milan Bakhda
Abstract
Objectives: Artificial Intelligence (AI) and Machine Learning (ML) have emerged as transformative technologies across various industries, including healthcare, biotechnology, and vaccine development. These technologies offer immense potential to improve project management efficiency, decision-making, and resource utilization, especially in complex tasks such as vaccine development and healthcare innovations.
Methods: A systematic meta-analysis was conducted by reviewing studies from databases like PubMed, IEEE Xplore, Scopus, Web of Science, EMBASE, and Google Scholar until September 2024. The analysis focused on the application of AI and ML in project management for vaccine development, biotechnology, and broader healthcare innovations using the PICO framework to guide study selection and inclusion. Statistical analyses were performed using Review Manager 5.4 and Comprehensive Meta-Analysis (CMA) software.
Results: The meta-analysis reviewed 44 studies examining the integration of AI and ML in healthcare, biotechnology, and vaccine development project management. Results demonstrated significant improvements in efficiency, resource allocation, decision-making, and risk management. AI/ML applications notably accelerated vaccine development, from candidate identification to clinical trial optimization, and improved predictive modeling for efficacy and safety. Subgroup analysis revealed variations in effectiveness across healthcare sectors, with the highest pooled effect sizes observed in infectious disease control (1.2; 95% CI: 0.85–1.50) compared to medical imaging (0.85; 95% CI: 0.75–0.95). Studies employing AI techniques demonstrated a pooled effect size of 0.83 (95% CI: 0.78–1.08). Despite the observed high heterogeneity (I² = 99.04%) and moderate-to-high risks of bias, sensitivity analyses confirmed the robustness of the findings. Overall, AI/ML integration offers transformative potential to enhance project management and vaccine development, driving innovation and efficiency in these critical fields.
Conclusion: AI and ML technologies show significant potential to transform project management practices in healthcare, biotechnology, and vaccine development by enhancing efficiency, predictive analytics, and decision-making capabilities. Their integration paves the way for more innovative, data-driven solutions that can adapt to evolving challenges in these fields.
Citations: 0
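The heterogeneity statistic I² quoted above is conventionally derived from Cochran's Q as I² = max(0, (Q − df) / Q) × 100, where df is the number of studies minus one. The sketch below applies that standard formula with inverse-variance weights; the function name `heterogeneity` and the three effect sizes and variances are hypothetical, and the study itself used Review Manager and CMA rather than code like this.

```python
def heterogeneity(effects, variances):
    """Fixed-effect pooled estimate, Cochran's Q, and I^2 (in percent)."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical effect sizes with equal variances, chosen to disagree strongly.
effects = [0.2, 0.8, 1.4]
variances = [0.01, 0.01, 0.01]
pooled, q, i2 = heterogeneity(effects, variances)
print(f"pooled={pooled:.2f}  Q={q:.1f}  I2={i2:.1f}%")
```

An I² near 100%, as in this toy example and in the abstract, means nearly all observed variation reflects real differences between studies rather than sampling error, so a single pooled effect size should be read with caution.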
Real-time assistance in suicide prevention helplines using a deep learning-based recommender system: A randomized controlled trial
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105760 Pub Date: 2024-12-17 DOI: 10.1016/j.ijmedinf.2024.105760
Salim Salmi, Saskia Mérelle, Nikki van Eijk, Renske Gilissen, Rob van der Mei, Sandjai Bhulai
Abstract
Objective: To evaluate the effectiveness and usability of an AI-assisted tool in providing real-time assistance to counselors during suicide prevention helpline conversations.
Methods: In this RCT, the intervention group used an AI-assisted tool, which generated suggestions based on sentence embeddings (i.e. BERT) from previous successful counseling sessions. Cosine similarity was used to present the top 5 chat situations to the counselors. The control group did not have access to the tool (care as usual). Both groups completed a questionnaire assessing their self-efficacy at the end of each shift. Counselors' usage of the tool was evaluated by measuring frequency, duration and content of interactions.
Results: In total, 48 counselors participated in the experiment: 27 counselors in the experimental condition and 21 counselors in the control condition. Together they rated 188 shifts. No significant difference in self-efficacy was observed between the two groups (p = 0.36). However, counselors who used the AI-assisted tool had marginally lower response time and used the tool more often during conversations that had a longer duration. A deeper analysis of usage showed that the tool was frequently used in inappropriate situations, e.g. after the counselor had already provided a response to the help-seeker, defeating the purpose of the information. When the tool was employed appropriately (64 conversations), it provided usable information in 53 conversations (83%). However, counselors used the tool less frequently at optimal moments, indicating their potential lack of proficiency with using AI-assisted tools during helpline conversations or initial trust issues with the system.
Conclusion: The study demonstrates benefits and pitfalls of integrating AI-assisted tools in suicide prevention for improving counselor support. Despite the lack of significant impact on self-efficacy, the support tool provided usable suggestions, and the frequent use during long conversations suggests counselors may wish to use the tool in complex or challenging interactions.
Citations: 0
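The retrieval step described in the Methods (rank earlier chat situations by cosine similarity of their embeddings and show the top 5) can be sketched as follows. This is an illustration under stated assumptions: the three-dimensional vectors stand in for real BERT sentence embeddings, and the names `cosine_similarity`, `top_k_similar`, and the `session_*` keys are hypothetical, not part of the trial's system.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_similar(query, corpus, k=5):
    """Return the names of the k corpus vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical toy embeddings of earlier chat situations.
corpus = {
    "session_a": [1.0, 0.0, 0.0],
    "session_b": [0.9, 0.1, 0.0],
    "session_c": [0.0, 1.0, 0.0],
    "session_d": [-1.0, 0.0, 0.0],
}
query = [1.0, 0.02, 0.0]  # embedding of the ongoing conversation (made up)
ranked = top_k_similar(query, corpus, k=2)
print(ranked)
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing sentence embeddings of texts with different lengths.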
Synthetic data generation in healthcare: A scoping review of reviews on domains, motivations, and future applications
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105763 Pub Date: 2024-12-17 DOI: 10.1016/j.ijmedinf.2024.105763
Miguel Rujas, Rodrigo Martín Gómez del Moral Herranz, Giuseppe Fico, Beatriz Merino-Barbancho
Abstract
Background: The development of Artificial Intelligence in the healthcare sector is generating a great impact. However, one of the primary challenges for the implementation of this technology is the access to high-quality data due to issues in data collection and regulatory constraints, for which synthetic data is an emerging alternative. While previous research has reviewed synthetic data generation techniques, there is limited focus on their applications and the motivations driving their synthesis. A comprehensive review is needed to expand the potential of synthetic data into less explored healthcare areas.
Objective: This review aims to identify the healthcare domains where synthetic data are currently generated, the motivations behind their creation, their future uses, limitations, and types of data.
Materials and methods: Following the PRISMA-ScR framework, this review analysed literature from the last 10 years within PubMed, Scopus, and Web of Science. Reviews containing information on synthetic data generation in healthcare were screened and analysed. Key healthcare domains, motivations, future uses, and gaps in the literature were identified through a structured data extraction process.
Results: Of the 346 reviews identified, 42 were included for data extraction. Thirteen main domains were identified, with Oncology, Neurology, and Cardiology being the most frequently mentioned. Five primary motivations for synthetic data generation and three major categories of future applications were highlighted. Additionally, unstructured data, particularly images, were found to be the predominant type of synthetic data generated.
Discussion and conclusion: Synthetic data are currently being generated across diverse healthcare domains, showcasing their adaptability and potential. Despite their early stage, synthetic data technologies hold significant promise for future applications. Expanding their use into new domains and less common data types (e.g., video and text) could further enhance their impact. Future work should focus on developing evaluation benchmarks and standardized generative models tailored to specific healthcare domains.
Citations: 0
Unmasking the chameleons: A benchmark for out-of-distribution detection in medical tabular data
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105762 Pub Date: 2024-12-17 DOI: 10.1016/j.ijmedinf.2024.105762
Mohammad Azizmalayeri, Ameen Abu-Hanna, Giovanni Cinà
Abstract
Background: Machine Learning (ML) models often struggle to generalize effectively to data that deviates from the training distribution. This raises significant concerns about the reliability of real-world healthcare systems encountering such inputs, known as out-of-distribution (OOD) data. These concerns can be addressed by real-time detection of OOD inputs. While numerous OOD detection approaches have been suggested in other fields, especially in computer vision, it remains unclear whether similar methods effectively address challenges posed by medical tabular data.
Objective: To answer this important question, we propose an extensive reproducible benchmark to compare different OOD detection methods in medical tabular data across a comprehensive suite of tests.
Method: To achieve this, we leverage 4 different and large public medical datasets, including eICU and MIMIC-IV, and consider various kinds of OOD cases within these datasets. For example, we examine OODs originating from a statistically different dataset than the training set according to the membership model introduced by Debray et al. [1], as well as OODs obtained by splitting a given dataset based on a value of a distinguishing variable. To identify OOD instances, we explore a range of 10 density-based methods that learn the marginal distribution of the data, alongside 17 post-hoc detectors that are applied on top of prediction models already trained on the data. The prediction models involve three distinct architectures, namely MLP, ResNet, and Transformer.
Main results: In our experiments, when the membership model achieved an AUC of 0.98, which indicated a clear distinction between OOD data and the training set, we observed that the OOD detection methods had achieved AUC values exceeding 0.95 in distinguishing OOD data. In contrast, in the experiments with subtler changes in data distribution, such as selecting OOD data based on ethnicity and age characteristics, many OOD detection methods performed similarly to a random classifier with AUC values close to 0.5. This may suggest a correlation between separability, as indicated by the membership model, and OOD detection performance, as indicated by the AUC of the detection model. This warrants future research.
Citations: 0
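The AUC values quoted above can be read as the probability that a randomly chosen OOD sample receives a higher detector score than a randomly chosen in-distribution sample. A minimal sketch of that pairwise definition follows; the function name `auc` and the scores are hypothetical and are not drawn from the benchmark.

```python
def auc(scores_ood, scores_in):
    """Fraction of (OOD, in-distribution) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for ood_score in scores_ood:
        for in_score in scores_in:
            if ood_score > in_score:
                wins += 1.0
            elif ood_score == in_score:
                wins += 0.5
    return wins / (len(scores_ood) * len(scores_in))


# Hypothetical detector scores: higher means "more likely out-of-distribution".
separable = auc([0.9, 0.8, 0.7], [0.3, 0.2, 0.1])  # clearly shifted data
overlapping = auc([0.4, 0.1], [0.3, 0.2])          # subtle shift, scores interleave
print(separable, overlapping)
```

The first case gives 1.0 because every OOD score exceeds every in-distribution score, while the second gives 0.5, the value the abstract associates with a random classifier.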
The determinants of help-seeking behaviors among cancer patients in online health communities: Evidence from China
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105767 Pub Date: 2024-12-16 DOI: 10.1016/j.ijmedinf.2024.105767
Xiandong Feng, Yinhuan Hu, Holger Pfaff, Sha Liu, Hui Wang, Zhen Qi
Abstract
Objective: Although online health communities offer a new approach to patient interaction, the help-seeking behaviors of cancer patients within these platforms remain unexplored. This study aims to identify the determinants influencing online help-seeking behaviors among cancer patients.
Method: Based on motivation theory, we proposed six hypotheses and developed a research model. Data were collected from 1100 cancer patients who sought help in a leading Chinese online cancer community in March, June, and September 2023. We used the fixed-effect negative binomial model to test research hypotheses.
Results: The findings indicated that the time since diagnosis (β = -0.127, P < 0.001) was negatively associated with online help-seeking behaviors among cancer patients. In contrast, social support (β = 0.002, P = 0.003) and disease stigma (β = 0.170, P < 0.001) positively influenced their help-seeking behaviors in online health communities. Furthermore, while male and female cancer patients showed decreased help-seeking behaviors as time since diagnosis increased, the decline was less pronounced for females (β = 0.040, P < 0.001). The positive impact of disease stigma on help-seeking behaviors is stronger for female patients than male patients (β = 0.098, P < 0.001).
Conclusion: This research broadens the understanding of how cancer patients seek help in digital environments and enhances theoretical insights into these behaviors.
Citations: 0
Training machine learning models to detect rare inborn errors of metabolism (IEMs) based on GC–MS urinary metabolomics for diseases screening
IF 3.7 | Zone 2 | Medicine
International Journal of Medical Informatics, Volume 195, Article 105765 Pub Date: 2024-12-16 DOI: 10.1016/j.ijmedinf.2024.105765
Haomin Li, Siyuan Gao, Dan Wu, Min Zhu, Zhenzhen Hu, Kexin Fang, Xiuru Chen, Zhou Ni, Jing Li, Beibei Zhao, Xuhui She, Xinwen Huang
Abstract
Background: Gas chromatography-mass spectrometry (GC–MS) has been shown to be a potentially efficient metabolic profiling platform in urine analysis. However, the widespread use of GC–MS for inborn errors of metabolism (IEM) screening is constrained by the rarity of IEMs in the population and by the difficulty and specialized nature of interpreting GC–MS organic acid profiles.
Methods: Based on 355,197 GC–MS test cases accumulated from 2013 to 2021 in China, a random forest-based machine learning model was proposed, trained, and evaluated. Weighted undersampling or oversampling data processing and staged modeling strategies were used to handle the highly imbalanced data and improve the ability of the model to identify different types of rare IEM cases.
Result: In the first-stage model, which only identified positive cases without discriminating the specific IEM, the screening sensitivity was 0.938 (or 0.991 if abnormal cases were also included). The average sensitivity of the second-stage models that classify 11 particular IEMs is 0.992, with an average specificity and accuracy of 0.944 and 0.969, respectively. The SHAP values visualized for each model explain the basis for the differential diagnosis made by the model.
Conclusion: With sufficient high-quality data, machine learning models can provide high-sensitivity GC–MS interpretation and greatly improve the efficiency and quality of GC–MS based IEM screening.
Citations: 0