Guest Editorial: Big data and artificial intelligence in healthcare

IF 2.8 Q3 ENGINEERING, BIOMEDICAL
Tim Hulsen, Francesca Manni
Healthcare Technology Letters 11(4): 207–209 | DOI: 10.1049/htl2.12086 | Published 2024-05-26

Big data refers to large datasets that can be mined and analysed using data science, statistics or machine learning (ML), often without defining a hypothesis upfront [1]. Artificial Intelligence (AI) refers to the simulation of human intelligence processes by machines, which can use these big data to find patterns, make predictions and even generate new data or information [2]. Big data has already been used for many years to improve healthcare [3] and medicine [1], by enabling researchers and medical professionals to draw conclusions from large and rich datasets rather than from clinical trials based on a small number of patients. More recently, AI has been used in healthcare as well, for example to find and classify tumours in magnetic resonance imaging (MRI) [4] or to improve and automate the clinical workflow [5]. The uptake of AI in healthcare is still increasing as new models and techniques are introduced. For example, the creation of large language models (LLMs) such as ChatGPT enables the use of generative AI (GenAI) in healthcare [6]. GenAI can be used to create synthetic data (where the original data has privacy issues), generate radiology or pathology reports, or power chatbots that interact with patients. The expectation is that the application of AI in healthcare will become even more important, as hospitals suffer from personnel shortages and growing numbers of elderly people need care. The rise of AI in healthcare also comes with challenges. Especially in healthcare, we want to know what an AI algorithm is doing; it should not be a 'black box'.
Explainable AI (XAI) can help the medical professional (or even the patient) to understand why an AI algorithm makes a certain decision, increasing trust in the result or prediction [7]. It is also important that AI complies with privacy laws, is free from bias, and does not produce toxic language (in the case of a medical chatbot). Responsible AI (RAI) tries to prevent these issues by providing a framework of ethical principles [8]. By embracing the (current and future) technical possibilities AI has to offer, while making sure that AI is explainable and responsible, we can help hospitals withstand future challenges.

This Special Issue contains six papers, all of which underwent peer review. One paper is about increasing the transparency of machine learning models, one is about cardiac disease risk prediction, and another is about depression detection in Roman Urdu social media posts. The remaining papers are about autism spectrum disorder detection using facial images, hybrid brain tumour classification of histopathology hyperspectral images, and prediction of the utilization of invasive and non-invasive ventilation throughout the intensive care unit (ICU) stay.

Lisboa argues in 'Open your black box classifier' [9] that the transparency of machine learning (ML) models is central to good practice when they are applied in high-risk settings. Recent developments make this feasible for tabular data (e.g. Excel, CSV), which is prevalent in risk modelling and computer-based decision support across multiple domains, including healthcare. The author outlines important motivating factors for interpretability and summarizes practical approaches, pointing out the main methods available.
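As a toy illustration of the kind of transparency at stake, the sketch below fits a minimal logistic regression on synthetic tabular data; its learned weights can be read directly as feature effects on the log-odds. This is a generic, self-contained example (invented data and features), not the specific method proposed in [9]:

```python
import math
import random

random.seed(0)

# Synthetic tabular "risk" data: the label is driven mainly by feature x1.
X = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(400)]
y = [1 if 2.0 * x1 - 0.5 * x2 + random.gauss(0, 0.5) > 0 else 0 for x1, x2 in X]

# Minimal logistic regression fitted by batch gradient descent.
w1 = w2 = b = 0.0
lr = 0.5
for _ in range(300):
    g1 = g2 = gb = 0.0
    for (x1, x2), t in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))
        g1 += (p - t) * x1
        g2 += (p - t) * x2
        gb += p - t
    w1 -= lr * g1 / len(X)
    w2 -= lr * g2 / len(X)
    b -= lr * gb / len(X)

# A transparent model: each weight is a readable effect on the log-odds.
print(f"log-odds = {w1:+.2f}*x1 {w2:+.2f}*x2 {b:+.2f}")
```

On this data the fitted weights recover the signs of the generating process (positive for x1, negative for x2), which is exactly the kind of direct reading a black-box model does not offer.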
The main finding is that any black box classifier making probabilistic predictions of class membership from tabular data can be represented by a globally interpretable model without loss of performance.

In 'Cardiac Disease Risk Prediction using Machine Learning Algorithms' [10], Stonier et al. build an ML system for predicting whether a patient is likely to suffer a heart attack, by analyzing various data sources including electronic health records (EHR) and clinical diagnosis reports from hospital clinics. Various algorithms, such as random forest (RF), regression models, K-nearest neighbour (KNN) and Naïve Bayes, are compared. Their RF algorithm provides a high accuracy (88.52%) in forecasting heart attack risk, which could herald a revolution in the diagnosis and treatment of cardiovascular illnesses.

Rehmani et al. argue in 'Depression Detection with Machine Learning of Structural and Non-Structural Dual Languages' [11] that depression is a painful and serious mental state, which has an adverse impact on human thoughts, feelings, and actions. Their study aims to create a dataset of social media posts in the Roman Urdu language, to predict the risk of depression in Roman Urdu as well as English. For Roman Urdu, English-language posts were obtained from Facebook and manually converted into Roman Urdu; English comments were obtained from Kaggle. Machine learning models, including Support Vector Machine (SVM), SVM with a Radial Basis Function kernel (SVM RBF), Random Forest and BERT, were investigated. The risk of depression was classified into three categories: not depressed, moderate depression, and severe depression. Out of these four models, SVM achieved the best result, with an accuracy of 84%.
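The overall shape of such a text-classification pipeline can be sketched in a few lines. The toy example below uses hand-made English posts, bag-of-words vectors and a nearest-centroid classifier as a lightweight stand-in for the paper's SVM; the posts and labels are invented for illustration:

```python
import math
from collections import Counter

# Hand-made toy posts (invented for illustration; not the paper's dataset).
train = [
    ("i feel happy and enjoy my work every day", "not_depressed"),
    ("great day with friends lots of fun", "not_depressed"),
    ("i feel sad empty and hopeless lately", "moderate"),
    ("everything feels heavy and sad this week", "moderate"),
    ("i cannot go on hopeless worthless all the time", "severe"),
    ("no reason to live worthless and numb", "severe"),
]

def bow(text):
    """Bag-of-words vector as a token-count dictionary."""
    return Counter(text.split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Nearest-centroid classification over bag-of-words counts: a lightweight
# stand-in for the SVM used in the paper.
centroids = {}
for text, label in train:
    centroids.setdefault(label, Counter()).update(bow(text))

def classify(text):
    v = bow(text)
    return max(centroids, key=lambda lbl: cosine(v, centroids[lbl]))

print(classify("i feel hopeless and sad"))  # → moderate
```

A production system would of course use a much larger corpus, a weighted representation such as TF-IDF, and a stronger classifier, but the train-vectorize-classify structure is the same.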
Their work refines the area of depression prediction, particularly in Asian countries.

'Autism Spectrum Disorder Detection using Facial Images: A Performance Comparison of Pretrained Convolutional Neural Networks' [12] by Ahmad et al. notes that studies have shown that early detection of ASD can assist in maintaining the behavioural and psychological development of children. Experts are currently studying various ML methods, particularly convolutional neural networks (CNNs), to expedite the screening process, and CNNs are considered promising frameworks for the diagnosis of ASD. Different pre-trained CNNs, such as ResNet34, ResNet50, AlexNet, MobileNetV2, VGG16 and VGG19, were employed to diagnose ASD, and their performance was compared. The authors applied transfer learning to every model included in the study to achieve better results than the original models. The proposed ResNet50 model achieved the highest accuracy, 92%, and the proposed method also outperformed state-of-the-art models in terms of accuracy and computational cost.

Cruz-Guerrero et al. discuss in 'Hybrid Brain Tumor Classification of Histopathology Hyperspectral Images by Linear Unmixing and an Ensemble of Deep Neural Networks' [13] that hyperspectral imaging (HSI) has demonstrated its potential to provide correlated spatial and spectral information of a sample as a non-contact, non-invasive technology. In the medical field, especially in histopathology, HSI has been applied to the classification and identification of diseased tissue and to the characterization of its morphological properties. The authors propose a hybrid scheme to classify non-tumour and tumour histological brain samples by HSI. The proposed approach first identifies characteristic components in a hyperspectral image by linear unmixing, as a feature engineering step, and then classifies the result with a deep learning approach.
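Linear unmixing itself reduces to a small least-squares problem per pixel: the observed spectrum is modelled as a weighted sum of endmember spectra, and the weights (abundances) are recovered. The sketch below unmixes one noise-free synthetic pixel over two hypothetical endmembers via the 2×2 normal equations; real pipelines typically add non-negativity and sum-to-one constraints on the abundances:

```python
# Hypothetical endmember spectra (3 bands each); values are illustrative only.
e1 = [0.9, 0.4, 0.1]   # "tumour-like" spectrum
e2 = [0.2, 0.5, 0.8]   # "non-tumour" spectrum

# Observed pixel: a linear mixture 0.3*e1 + 0.7*e2 (noise-free for clarity).
pixel = [0.3 * a + 0.7 * b for a, b in zip(e1, e2)]

# Least-squares unmixing via the 2x2 normal equations (E^T E) x = E^T p.
dot = lambda u, v: sum(a * b for a, b in zip(u, v))
a11, a12, a22 = dot(e1, e1), dot(e1, e2), dot(e2, e2)
b1, b2 = dot(e1, pixel), dot(e2, pixel)
det = a11 * a22 - a12 * a12
x1 = (a22 * b1 - a12 * b2) / det   # abundance of endmember e1
x2 = (a11 * b2 - a12 * b1) / det   # abundance of endmember e2

print(round(x1, 3), round(x2, 3))  # → 0.3 0.7
```

The recovered abundance maps, rather than the raw bands, then serve as the engineered features fed to the classifier.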
For this last step, an ensemble of deep neural networks is evaluated by a cross-validation scheme on an augmented dataset, together with a transfer learning scheme. The proposed method classifies histological brain samples with an average accuracy of 88%, with reduced variability, computational cost and inference times, which presents an advantage over state-of-the-art methods. Their work thus demonstrates the potential of hybrid classification methodologies, combining linear unmixing for feature extraction with deep learning for classification, to achieve robust and reliable results.

Finally, in 'Machine learning modeling for predicting the utilization of invasive and non-invasive ventilation throughout the ICU duration' [14], Schwager et al. present a machine learning model to predict the need for both invasive and non-invasive mechanical ventilation in ICU patients. Using the Philips eICU Research Institute (ERI) database, data from 2.6 million ICU patients admitted between 2010 and 2019 were analyzed; additionally, an external test set from a single hospital in this database was used to assess the model's generalizability. Model performance was determined by comparing the model's probability predictions with the actual incidence of ventilation use, either invasive or non-invasive. The model demonstrated an AUC of 0.921 for overall ventilation, 0.937 for invasive, and 0.827 for non-invasive ventilation. Factors such as high Glasgow Coma Scale scores, younger age, lower body mass index (BMI) and lower partial pressure of carbon dioxide (PaCO2) were highlighted as indicators of a lower likelihood of needing ventilation. The model can serve as a retrospective benchmarking tool for hospitals to assess ICU performance concerning mechanical ventilation necessity.
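The AUC figures above have a simple probabilistic reading: the chance that a randomly chosen ventilated patient receives a higher risk score than a randomly chosen non-ventilated one. A minimal sketch with invented scores:

```python
def auc(labels, scores):
    """ROC AUC via its Mann-Whitney formulation: the probability that a
    random positive outranks a random negative, counting ties as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores for six patients (label 1 = ventilated).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(round(auc(labels, scores), 3))  # → 0.889
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported 0.921 overall figure in context.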
It also enables analysis of ventilation strategy trends and risk-adjusted comparisons, with potential for future testing as a clinical decision tool for optimizing ICU ventilation management.

Tim Hulsen is a Senior Data & AI Scientist with broad experience in both academia and industry, working on a wide range of projects, mostly in oncology. After receiving his MSc in biology in 2001, he obtained a PhD in bioinformatics in 2007 from a collaboration between Radboud University Nijmegen and the pharma company N.V. Organon. After a two-year postdoc at Radboud University Nijmegen, he moved to Philips Research in 2009, where he worked on biomarker discovery for one year before moving into data management and data science, working on big data projects in oncology such as Prostate Cancer Molecular Medicine (PCMM), Translational Research IT (TraIT), Movember Global Action Plan 3 (GAP3), the European Randomized Study of Screening for Prostate Cancer (ERSPC), and Liquid Biopsies and Imaging (LIMA). His most recent projects are ReIMAGINE, on the use of imaging to prevent unnecessary biopsies in prostate cancer, and SMART-BEAR, on the development of an innovative platform to support the healthy and independent living of elderly people. He is the author of several publications on big data, data management, data science, and artificial intelligence in the context of healthcare and medicine.

Francesca Manni is currently a Clinical Scientist at Philips and a guest researcher at Eindhoven University of Technology. Francesca has a background in biomedical engineering and computer vision with a focus on AI for medical imaging, having completed a PhD at Eindhoven University of Technology, where she worked in close collaboration with leading EU hospitals.
She focused on the development and application of novel imaging/sensing technologies, such as hyperspectral imaging, to build dedicated solutions for tumour detection and minimally invasive surgery. Her dissertation work resulted in novel algorithms for patient tracking during spinal surgery and for cancer detection. After that, she was an AI & Data Scientist at Philips Research from 2021 to 2023, working in the AI-for-vision field on enabling AI solutions in healthcare and on the deployment of AI algorithms in many European hospitals. During this period, she led the Healthcare group at the Big Data Value Association (BDVA). Francesca's research is reported in numerous international peer-reviewed scientific journals and top international conference proceedings in the fields of computer vision, image-guided interventions, and AI privacy-preserving techniques.

Tim Hulsen: Writing—original draft; writing—review and editing. Francesca Manni: Writing—review and editing.

The authors declare no conflict of interest.
Guest Editorial: Big data and artificial intelligence in healthcare

Big data refers to large datasets that can be mined and analysed using data science, statistics or machine learning (ML), often without defining a hypothesis upfront [1]. Artificial Intelligence (AI) refers to the simulation of human intelligence processes by machines, which can use these big data to find patterns, to make predictions and even to generate new data or information [2]. Big data has been used to improve healthcare [3] and medicine [1] already for many years, by enabling researchers and medical professionals to draw conclusions from large and rich datasets rather than from clinical trials based on a small number of patients. More recently, AI has been used in healthcare as well, for example by finding and classifying tumours in magnetic resonance images (MRI) [4] or by improving and automating the clinical workflow [5]. This uptake of AI in healthcare is still increasing, as new models and techniques are being introduced. For example, the creation of large language models (LLMs) such as ChatGPT enables the use of generative AI (GenAI) in healthcare [6]. GenAI can be used to create synthetic data (where the original data has privacy issues), generate radiology or pathology reports, or create chatbots to interact with the patient. The expectation is that the application of AI in healthcare will get even more important, as hospitals are suffering from personnel shortages and increasing numbers of elderly people needing care. The rise of AI in healthcare also comes with some challenges. Especially in healthcare, we want to know what the AI algorithm is doing; it should not be a ‘black box’. Explainable AI (XAI) can help the medical professional (or even the patient) to understand why the AI algorithm makes a certain decision, increasing trust in the result or prediction [7]. It is also important that AI works according to privacy laws, is free from bias, and does not produce toxic language (in case of a medical chatbot). 
Responsible AI (RAI) tries to prevent these issues by providing a framework of ethical principles [8]. By embracing the (current and future) technical possibilities AI has to offer, and at the same time making sure that AI is explainable and responsible, we can make sure that hospitals are able to withstand any future challenges.

This Special Issue contains six papers, all of which underwent peer review. One paper is about increasing the transparency of machine learning models, one is about cardiac disease risk prediction, and another one is about depression detection in Roman Urdu social media posts. The other papers are about autism spectrum disorder detection using facial images, hybrid brain tumour classification of histopathology hyperspectral images, and prediction of the utilization of invasive and non-invasive ventilation throughout the intensive care unit (ICU) duration.

Lisboa discusses in ‘Open your black box classifier’ [9] that the transparency of machine learning (ML) models is central to good practice when they are applied in high-risk applications. Recent developments make this feasible for tabular data (Excel, CSV etc.), which is prevalent in risk modelling and computer-based decision support across multiple domains including healthcare. The author outlines important motivating factors for interpretability and summarizes practical approaches, pointing out the main methods available. The main finding is that any black box classifier making probabilistic predictions of class membership from data in tabular form can be represented with a globally interpretable model without performance loss.

In ‘Cardiac Disease Risk Prediction using Machine Learning Algorithms’ [10], Stonier et al. try to create a ML system that is used for predicting whether a patient is likely to develop heart attacks, by analyzing various data sources including electronic health records (EHR) and clinical diagnosis reports from hospital clinics. Various algorithms such as RF, regression models, K-nearest neighbour (KNN), Naïve Bayes algorithm etc., are compared. Their RF algorithm provides a high accuracy (88.52%) in forecasting heart attack risk, which could herald a revolution in the diagnosis and treatment of cardiovascular illnesses.

Rehmani et al. argue in ‘Depression Detection with Machine Learning of Structural and Non-Structural Dual Languages’ [11] that depression is a painful and serious mental state, which has an adversarial impact on human thoughts, feeling, and actions. Their study aims to create a dataset of social media posts in the Roman Urdu language, to predict the risk of depression in Roman Urdu as well as English. For Roman Urdu, English language data has been obtained from Facebook, which was manually converted into Roman Urdu. English comments were obtained from Kaggle. Machine learning models, including Support Vector Machine (SVM), Support Vector Machine Radial Basis Function (SVM RBF), Random Forest (RF), and BERT, were investigated. The risk of depression was classified into three categories: not depressed, moderate depression, and severe depression. Out of these four models, SVM achieved the best result with an accuracy of 84%. Their work refines the area of depression prediction, particularly in Asian countries.

‘Autism Spectrum Disorder Detection using Facial Images: A Performance Comparison of Pretrained Convolutional Neural Networks’ [12] by Ahmad et al. notes that studies have shown that early detection of autism spectrum disorder (ASD) can assist in maintaining the behavioural and psychological development of children. Experts are currently studying various ML methods, particularly convolutional neural networks (CNNs), to expedite the screening process; CNNs are considered promising frameworks for the diagnosis of ASD. Different pre-trained CNNs, such as ResNet34, ResNet50, AlexNet, MobileNetV2, VGG16, and VGG19, were employed to diagnose ASD, and their performance was compared. The authors applied transfer learning to every model in the study, achieving better results than the base models. The proposed ResNet50 model reached the highest accuracy, 92%, and the method also outperformed state-of-the-art models in terms of accuracy and computational cost.
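The core transfer-learning idea, keeping a pre-trained feature extractor frozen and training only a new classification head, can be sketched in miniature. Here a fixed random projection with a ReLU stands in for the frozen CNN backbone, and scikit-learn's digits dataset stands in for facial images; both are assumptions for illustration only:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in image dataset (8x8 digit images flattened to 64 features)
X, y = load_digits(return_X_y=True)

# Frozen "backbone": fixed weights that are never updated during training
rng = np.random.default_rng(0)
W = rng.standard_normal((X.shape[1], 32))
features = np.maximum(X @ W, 0)  # fixed nonlinear feature extractor (ReLU)

# Only the classification head is trained on the extracted features
Xtr, Xte, ytr, yte = train_test_split(features, y, random_state=0)
head = LogisticRegression(max_iter=5000).fit(Xtr, ytr)
print(f"held-out accuracy with frozen features: {head.score(Xte, yte):.3f}")
```

With a real pretrained backbone (e.g. ResNet50 features instead of a random projection), the same head-only training is what makes transfer learning cheap on small medical datasets.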

Cruz-Guerrero et al. discuss in ‘Hybrid Brain Tumor Classification of Histopathology Hyperspectral Images by Linear Unmixing and an Ensemble of Deep Neural Networks’ [13] that hyperspectral imaging (HSI) has demonstrated its potential to provide correlated spatial and spectral information about a sample using a non-contact, non-invasive technology. In the medical field, especially in histopathology, HSI has been applied to the classification and identification of diseased tissue and to the characterization of its morphological properties. The authors propose a hybrid scheme to classify non-tumour and tumour histological brain samples by HSI. The approach identifies characteristic components in a hyperspectral image by linear unmixing, as a feature engineering step, followed by classification with a deep learning approach. For this last step, an ensemble of deep neural networks is evaluated with a cross-validation scheme on an augmented dataset and a transfer learning scheme. The proposed method classifies histological brain samples with an average accuracy of 88%, with reduced variability, computational cost, and inference times, which presents an advantage over state-of-the-art methods. Their work thus demonstrates the potential of hybrid classification methodologies to achieve robust and reliable results by combining linear unmixing for feature extraction and deep learning for classification.
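A rough analogue of the hybrid pipeline, linear unmixing as feature engineering followed by a learned classifier, can be sketched with non-negative matrix factorization (NMF) as the unmixing step and logistic regression standing in for the deep ensemble. The digits dataset is a stand-in for hyperspectral pixel spectra, so this is illustrative only:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Stand-in for non-negative spectral data (pixel intensities 0-16)
X, y = load_digits(return_X_y=True)

# Step 1: "unmixing" into a small set of non-negative components (abundances)
# Step 2: classify samples from their component abundances
pipe = make_pipeline(
    NMF(n_components=12, init="nndsvda", max_iter=500, random_state=0),
    LogisticRegression(max_iter=5000),
)

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
pipe.fit(Xtr, ytr)
print(f"accuracy with unmixing-style features: {pipe.score(Xte, yte):.3f}")
```

The appeal of the hybrid design is that the unmixing step compresses each high-dimensional spectrum into a few physically interpretable abundances, which keeps the downstream classifier small and fast.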

Finally, in ‘Machine learning modeling for predicting the utilization of invasive and non-invasive ventilation throughout the ICU duration’ [14], Schwager et al. present a machine learning model to predict the need for both invasive and non-invasive mechanical ventilation in ICU patients. Using the Philips eICU Research Institute (ERI) database, data from 2.6 million ICU patients treated between 2010 and 2019 were analysed. Additionally, an external test set from a single hospital in this database was used to assess the model's generalizability. Model performance was determined by comparing the model's probability predictions with the actual incidence of ventilation use, either invasive or non-invasive. The model achieved an AUC of 0.921 for overall ventilation, 0.937 for invasive ventilation, and 0.827 for non-invasive ventilation. Factors such as high Glasgow Coma Scale scores, younger age, lower body mass index (BMI), and lower partial pressure of carbon dioxide (PaCO2) were highlighted as indicators of a lower likelihood of needing ventilation. The model can serve as a retrospective benchmarking tool for hospitals to assess ICU performance concerning mechanical ventilation necessity. It also enables analysis of ventilation strategy trends and risk-adjusted comparisons, with potential for future testing as a clinical decision tool for optimizing ICU ventilation management.
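Evaluating such a probabilistic model by AUC can be sketched as follows, using a synthetic imbalanced dataset and logistic regression as stand-ins for the eICU data and the authors' model (all names and parameters here are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced outcome, loosely mimicking ventilation-need prevalence
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.8, 0.2], random_state=0
)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(Xtr, ytr)

# AUC compares predicted probabilities against observed outcomes,
# and is insensitive to the choice of a single decision threshold
auc = roc_auc_score(yte, model.predict_proba(Xte)[:, 1])
print(f"AUC: {auc:.3f}")
```

AUC is the natural headline metric for imbalanced clinical outcomes like ventilation need, since plain accuracy can look high even for a model that never predicts the minority class.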

Tim Hulsen is a Senior Data & AI Scientist with broad experience in both academia and industry, working on a wide range of projects, mostly in oncology. After receiving his MSc in biology in 2001, he obtained a PhD in bioinformatics in 2007 from a collaboration between Radboud University Nijmegen and the pharma company N.V. Organon. After a two-year postdoc at Radboud University Nijmegen, he moved to Philips Research in 2009, where he worked on biomarker discovery for a year before moving into data management and data science, working on big data projects in oncology such as Prostate Cancer Molecular Medicine (PCMM), Translational Research IT (TraIT), Movember Global Action Plan 3 (GAP3), the European Randomized Study of Screening for Prostate Cancer (ERSPC), and Liquid Biopsies and Imaging (LIMA). His most recent projects are ReIMAGINE, on the use of imaging to prevent unnecessary biopsies in prostate cancer, and SMART-BEAR, on the development of an innovative platform to support the healthy and independent living of elderly people. He is the author of several publications on big data, data management, data science, and artificial intelligence in the context of healthcare and medicine.

Francesca Manni is currently a Clinical Scientist at Philips and a guest researcher at Eindhoven University of Technology. She has a background in biomedical engineering and computer vision with a focus on AI for medical imaging, having completed a PhD at Eindhoven University of Technology in close collaboration with leading EU hospitals. She focused on the development and application of novel imaging/sensing technologies, such as hyperspectral imaging, to build dedicated solutions for tumour detection and minimally invasive surgery. Her dissertation work resulted in novel algorithms for patient tracking during spinal surgery and cancer detection. She was subsequently an AI & Data Scientist at Philips Research from 2021 to 2023, working in the AI-for-vision field to enable AI solutions in healthcare and the deployment of AI algorithms in many European hospitals. During this period, she led the Healthcare group at the Big Data Value Association (BDVA). Francesca's research is reported in numerous international peer-reviewed scientific journals and top international conference proceedings in the fields of computer vision, image-guided interventions, and privacy-preserving AI techniques.

Tim Hulsen: Writing—original draft; writing—review and editing. Francesca Manni: Writing—review and editing.

The authors declare no conflict of interest.

Source journal: Healthcare Technology Letters