Refika Sultan Doğan , Ebru Akay , Serkan Doğan , Bülent Yılmaz
{"title":"Hyperplastic and tubular polyp classification using machine learning and feature selection","authors":"Refika Sultan Doğan , Ebru Akay , Serkan Doğan , Bülent Yılmaz","doi":"10.1016/j.ibmed.2024.100177","DOIUrl":"10.1016/j.ibmed.2024.100177","url":null,"abstract":"<div><h3>Purpose</h3><div>The aim of this study is to develop an effective approach for differentiating between hyperplastic and tubular adenoma colon polyps, which is one of the most difficult tasks in colonoscopy procedures. The main research challenge is how to improve the classification of these polyp subtypes by applying various focusing levels on the polyp images, data preprocessing approaches, and classification algorithms.</div></div><div><h3>Methods</h3><div>This study employed 202 colonoscopy videos from a total of 201 patients, focusing on 59 videos containing hyperplastic and tubular adenoma polyps. Key frames were manually extracted, and several feature extraction and classification techniques were applied. The influence of different datasets with various focuses as well as data preprocessing steps on the performance of classification was examined, and AUC values were calculated using ten classifiers.</div></div><div><h3>Results</h3><div>The study found that the choice of dataset, data preprocessing method, and classification algorithm all had significant effects on classification results. The Random Forest model with the Recursive Feature Elimination (RFE) feature selection approach, for example, consistently outperformed other models and achieved the highest AUC value of 0.9067. In terms of accuracy, F1 score, recall, and AUC, the suggested model outperformed a gastroenterologist, although its precision remained slightly lower.</div></div><div><h3>Conclusion</h3><div>This study emphasizes the importance of dataset selection, data preprocessing, and feature selection in enhancing the classification of difficult colon polyp subtypes. 
The suggested model offers a promising approach for the clinical differentiation of hyperplastic and tubular adenoma polyps, potentially improving diagnostic accuracy in gastroenterology.</div></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"10 ","pages":"Article 100177"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142446010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
George Leifman , Tomer Golany , Ehud Rivlin , Wisam Khoury , Ahmad Assalia , Petachia Reissman
{"title":"Real-time artificial intelligence validation of critical view of safety in laparoscopic cholecystectomy","authors":"George Leifman , Tomer Golany , Ehud Rivlin , Wisam Khoury , Ahmad Assalia , Petachia Reissman","doi":"10.1016/j.ibmed.2024.100153","DOIUrl":"https://doi.org/10.1016/j.ibmed.2024.100153","url":null,"abstract":"<div><h3>Background</h3><p>Critical View of Safety (CVS) is the accepted strategy to avoid bile duct injury during Laparoscopic Cholecystectomy (LC). In this study, we sought to investigate the accuracy and performance of a trained Artificial Intelligence (AI) model in validating CVS achievement during elective LC in a real-time operating room setting.</p></div><div><h3>Study design</h3><p>A deep learning neural network, previously trained on annotated segments of 700 LC videos to identify the CVS criteria, was integrated into the operating room laparoscopic video system for continuous monitoring and real-time validation of CVS achievement during elective LC procedures. The system's feedback and surgeon's report were recorded and compared, as well as the overall rate of CVS achievement.</p></div><div><h3>Results</h3><p>Of 40 consecutive LC, CVS was reported by the surgeons in 34 (85 %). In all the 6 cases where CVS was not achieved due to severe inflammation or anatomy distortion, the AI model agreed with the surgeon's report and did not identify CVS. Out of the 34 cases where CVS was achieved, the AI model identified 33. Thus, the AI model detected the CVS achievement with a specificity of 100 % [95%-CI 98.1 %, 100 %] and sensitivity of 97 % [95%-CI 96.1 %, 98.2 %].</p></div><div><h3>Conclusions</h3><p>A trained AI model can identify CVS during elective LC with very high accuracy in a real-time OR setting. 
Additionally, its use may result in high rates of CVS achievement, thereby improving LC procedure's safety and outcome.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"10 ","pages":"Article 100153"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521224000206/pdfft?md5=f07707d889089060ef7b66be4c734e24&pid=1-s2.0-S2666521224000206-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141605524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
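The sensitivity and specificity quoted in the abstract above follow directly from its case counts. The sketch below is illustrative only: the function name is hypothetical, the surgeon's report is assumed to be the reference standard, and the confidence intervals reported by the authors are not reproduced because the abstract does not state how they were computed.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of reference-positive cases the model detected
    specificity = tn / (tn + fp)  # share of reference-negative cases the model rejected
    return sensitivity, specificity


# Counts taken from the abstract (surgeon's report assumed as reference):
# 34 cases with CVS achieved, of which the model identified 33;
# 6 cases without CVS, none of which the model flagged as CVS.
sens, spec = sensitivity_specificity(tp=33, fn=1, tn=6, fp=0)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
# prints: sensitivity = 0.971, specificity = 1.000
```

With only 40 cases, these point estimates are sensitive to single misclassifications, so the denominators (34 and 6) are worth reading alongside the percentages.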
Nathan Phelps , Stephanie Marrocco , Stephanie Cornell , Dalton L. Wolfe , Daniel J. Lizotte
{"title":"Reinforcement learning in large, structured action spaces: A simulation study of decision support for spinal cord injury rehabilitation","authors":"Nathan Phelps , Stephanie Marrocco , Stephanie Cornell , Dalton L. Wolfe , Daniel J. Lizotte","doi":"10.1016/j.ibmed.2024.100137","DOIUrl":"https://doi.org/10.1016/j.ibmed.2024.100137","url":null,"abstract":"<div><p>Reinforcement learning (RL) has helped improve decision-making in several domains but can be challenging to apply; this is the case for rehabilitation of people with a spinal cord injury (SCI). Among other factors, applying RL in this domain is difficult because there are many possible treatments (i.e., large action space) and few detailed records of longitudinal treatments and outcomes (i.e., limited training data). Applying Fitted Q Iteration in this domain with linear models and the most natural state and action representation results in problems with convergence and overfitting. However, isolating treatments from one another can mitigate the convergence issue, and treatments for SCIs have meaningful groupings that can be used to combat overfitting. We propose two approaches to grouping treatments so that an RL agent can learn effectively from limited data. One relies on domain knowledge of SCI rehabilitation and the other learns similarities among treatments using an embedding technique. After re-interpreting the data using these treatment grouping approaches in conjunction with our process that isolates the treatment groups, we use Fitted Q Iteration to train an agent that learns to select better treatments. Through a simulation study designed to reflect the properties of SCI rehabilitation, we find that agents trained after using either grouping method can help improve the treatment decisions of individual physiotherapists, but the approach based on domain knowledge offers better performance. 
Our findings provide a proof of concept that applying RL has the potential to help improve the treatment of those with an SCI and indicates that continued efforts to gather data and apply RL to this domain are worthwhile.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"9 ","pages":"Article 100137"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521224000048/pdfft?md5=0e9b4fe44a6fce7ea5f3e30e6224f595&pid=1-s2.0-S2666521224000048-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141291377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thais Maria Santos Bezerra , Matheus Silva de Deus , Felipe Cavalaro , Denise Ribeiro , Ana Luiza Seidinger , Izilda Aparecida Cardinalli , Andreia de Melo Porcari , Luciano de Souza Queiroz , Helio Pedrini , Joao Meidanis
{"title":"Deep learning outperforms classical machine learning methods in pediatric brain tumor classification through mass spectra","authors":"Thais Maria Santos Bezerra , Matheus Silva de Deus , Felipe Cavalaro , Denise Ribeiro , Ana Luiza Seidinger , Izilda Aparecida Cardinalli , Andreia de Melo Porcari , Luciano de Souza Queiroz , Helio Pedrini , Joao Meidanis","doi":"10.1016/j.ibmed.2024.100178","DOIUrl":"10.1016/j.ibmed.2024.100178","url":null,"abstract":"<div><div>Pediatric brain tumors are the most common cause of death among all childhood cancers, and surgical resection is usually the first step in disease management. During surgery, it is important to perform safe gross resection of tumors, retaining as much brain tissue as possible. Therefore, appropriate resection margin delineation is extremely relevant.</div><div>Currently available methods for tissue analysis have limited precision, are time-consuming, and often require multiple invasive procedures. Our main goal is to test whether machine learning techniques are capable of classifying the pediatric brain tissue chemical profile generated by DESI-MSI, which is mainly lipidic, into normal or abnormal tissue and into low- and high-grade malignancy subareas within each sample.</div><div>Our experiments show that deep learning methods outperform classical machine learning methods in the task of classifying brain tissue from DESI-MSI mass spectra, both in normal versus abnormal tissue, and, for malignant tissues, in low-grade versus high-grade malignancy.</div><div>Our conclusions are based on the analysis of 34,870 annotated spectra, obtained from the neoplastic and non-neoplastic microanatomical stratification of individual samples from 116 pediatric patients who underwent brain tumor surgical resection at the Boldrini Children’s Center between 2000 and 2020. 
Support Vector Machines, Random Forests, and Least Absolute Shrinkage and Selection Operator (LASSO) were among the classical machine learning techniques evaluated.</div></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"10 ","pages":"Article 100178"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142532836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhen Yu Gordon Ko , Yang Li , Jiulong Liu , Hui Ji , Anqi Qiu , Nanguang Chen
{"title":"DOTnet 2.0: Deep learning network for diffuse optical tomography image reconstruction","authors":"Zhen Yu Gordon Ko , Yang Li , Jiulong Liu , Hui Ji , Anqi Qiu , Nanguang Chen","doi":"10.1016/j.ibmed.2023.100133","DOIUrl":"https://doi.org/10.1016/j.ibmed.2023.100133","url":null,"abstract":"<div><p>Breast cancer is the most common cancer worldwide. The standard imaging modality for breast cancer screening is X-ray mammography, which suffers from low sensitivities in women with dense breasts and can potentially cause cancers despite a low radiation dosage. Diffuse Optical Tomography (DOT) is a noninvasive imaging technique that can potentially be employed to improve breast cancer early detection. However, conventional model-based algorithms for reconstructing DOT images usually produce low-quality images with limited resolution and low reconstruction accuracy. We propose to integrate deep neural networks (DNNs) with the conventional DOT reconstruction methods. This hybrid framework significantly enhances image quality. The DNNs have been trained and tested with sample data derived from clinically relevant breast models. The sample dataset contains blood vessel structures from breast structures and artificially created vessels using the Lindenmayer-system algorithm. By comparing the hybrid reconstruction with the ground truth image, we demonstrated a multi-scale structural similarity index measure (MS-SSIM) score of 0.80–0.90, whereas conventional reconstruction yielded a much lower MS-SSIM score of 0.36–0.59. In terms of DOT image quality, both qualitative and quantitative assessments of the reconstructed images signify that the hybrid approach is superior to conventional methods. 
This improvement suggests that DOT can potentially become a viable alternative to breast cancer screening, providing a step towards the next-generation device for optical mammography.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"9 ","pages":"Article 100133"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521223000479/pdfft?md5=b2e58d94df5991666cbcf475e94e18db&pid=1-s2.0-S2666521223000479-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139748945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lisa Reinhart , Anne C. Bischops , Janna-Lina Kerth , Maurus Hagemeister , Bert Heinrichs , Simon B. Eickhoff , Juergen Dukart , Kerstin Konrad , Ertan Mayatepek , Thomas Meissner
{"title":"Artificial intelligence in child development monitoring: A systematic review on usage, outcomes and acceptance","authors":"Lisa Reinhart , Anne C. Bischops , Janna-Lina Kerth , Maurus Hagemeister , Bert Heinrichs , Simon B. Eickhoff , Juergen Dukart , Kerstin Konrad , Ertan Mayatepek , Thomas Meissner","doi":"10.1016/j.ibmed.2024.100134","DOIUrl":"10.1016/j.ibmed.2024.100134","url":null,"abstract":"<div><h3>Objectives</h3><p>Recent advances in Artificial Intelligence (AI) offer promising opportunities for its use in pediatric healthcare. This is especially true for early identification of developmental problems where timely intervention is essential, but developmental assessments are resource-intensive. AI carries potential as a valuable tool in the early detection of such developmental issues. In this systematic review, we aim to synthesize and evaluate the current literature on AI usage in monitoring child development, including possible clinical outcomes, and acceptability of such technologies by different stakeholders.</p></div><div><h3>Material and methods</h3><p>The systematic review is based on a literature search comprising the databases PubMed, Cochrane Library, Scopus, Web of Science, Science Direct, PsycInfo, ACM and Google Scholar (time interval 1996–2022). All articles addressing AI usage in monitoring child development or describing respective clinical outcomes and opinions were included.</p></div><div><h3>Results</h3><p>Of 2814 identified articles, 71 were ultimately included. Seventy reported on AI usage and one study dealt with users’ acceptance of AI. No article reported on potential clinical outcomes of AI applications. Articles showed a peak from 2020 to 2022. The majority of studies were from the US, China and India (n = 45) and mostly used pre-existing datasets such as electronic health records or speech and video recordings. 
The most used AI methods were support vector machines and deep learning.</p></div><div><h3>Conclusion</h3><p>A few well-proven AI applications in developmental monitoring exist. However, the majority have not been evaluated in clinical practice. The subdomains of cognitive, social and language development are particularly well-represented. Another focus is on early detection of autism. Potential clinical outcomes of AI usage and users' acceptance have rarely been considered so far. While the increase in publications in recent years suggests a growing interest in AI implementation in child development monitoring, future research should focus on clinical practice application and stakeholders' needs.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"9 ","pages":"Article 100134"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521224000012/pdfft?md5=069d33a41736fe9c351d51eab8c166bf&pid=1-s2.0-S2666521224000012-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139877435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katherine S. Adcock , Gabriel Byczynski , Emma Meade , Sook Ling Leong , Richard Gault , Hubert Lim , Sven Vanneste
{"title":"Feasibility of deep learning to predict tinnitus patient outcomes","authors":"Katherine S. Adcock , Gabriel Byczynski , Emma Meade , Sook Ling Leong , Richard Gault , Hubert Lim , Sven Vanneste","doi":"10.1016/j.ibmed.2024.100141","DOIUrl":"https://doi.org/10.1016/j.ibmed.2024.100141","url":null,"abstract":"<div><p>Advances in machine and deep learning techniques provide a novel approach in understanding complex patterns within large datasets, leading to an implementation of personalized medicine approaches to support clinical decision making. Results from recent clinical trials (TENT-A1 and TENT-A2 studies; clinicaltrials.gov: <span>NCT02669069</span> and <span>NCT03530306</span>) support that a novel bimodal neuromodulation approach could be a breakthrough treatment for patients with tinnitus, which adversely affects 10–15 % of the population. Given the heterogeneity of symptoms, it is important to identify whether treatment has an optimal effect on specific subgroups of tinnitus patients. 
The current study is a first look at the feasibility of using deep learning modelling on patient reported data to predict treatment outcomes in individuals with tinnitus, and highlights what features are most beneficial for clinical decision making.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"9 ","pages":"Article 100141"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521224000085/pdfft?md5=be723d4e20025718809aab06a9a42aa7&pid=1-s2.0-S2666521224000085-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141097355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of cardiovascular disease using explainable artificial intelligence and gut microbiota data","authors":"Can Duyar , Simone Oliver Senica , Habil Kalkan","doi":"10.1016/j.ibmed.2024.100180","DOIUrl":"10.1016/j.ibmed.2024.100180","url":null,"abstract":"<div><h3>Purpose:</h3><div>Gut microbiota are defined as the microbial population of the intestines. They include various types of bacteria which can influence and predict the existence or onset of some specific diseases. Therefore, it is a common practice in medicine to analyze the gut microbiota for diagnostic purposes by analyzing certain measurable biochemical features associated with the disease under investigation. However, the evaluation of all the data collected from the gut microbiota is a labor-intensive process. Artificial Intelligence (AI) may be a helpful tool to identify the hidden patterns in gut microbiota for the detection of disease and other classification problems.</div></div><div><h3>Methods:</h3><div>In this study, we propose a deep neural model based on a one-dimensional convolutional neural network (1D-CNN) to detect cardiovascular disease using bacterial taxonomy and OTU (Operational Taxonomic Unit) table data. The developed AI method is compared to classical machine learning algorithms, regression, boosting algorithms and a deep model developed for tabular data, Tabular Network (TabNet), and it achieved better classification results.</div></div><div><h3>Results:</h3><div>According to AUC (Area Under Curve) values, boosting and regression methods outperformed the classical machine learning methods. However, the highest value of 97.09 AUC was obtained with the developed 1D-CNN model by using bacterial taxonomy data, even with fewer samples than expected. 
Using explainable AI, nine bacteria that the models find important for classification were identified.</div></div><div><h3>Conclusion:</h3><div>The proposed method is robust and well adapted to taxonomy data in tabular form. It can be easily adapted to detect other diseases by using taxonomy data. The study also investigated the effect of barcode sequences on classification, but the results showed that barcode sequences add nothing to the bacterial taxonomy data for the estimation of CVD.</div></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"10 ","pages":"Article 100180"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142532834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of contactless human vital signs monitoring device with remote-photoplethysmography using adaptive region-of-interest and hybrid processing methods","authors":"Dessy Novita , Fajar Wira Adikusuma , Nanang Rohadi , Bambang Mukti Wibawa , Agus Trisanto , Irma Ruslina Defi , Sherllina Rizqi Fauziah","doi":"10.1016/j.ibmed.2024.100160","DOIUrl":"10.1016/j.ibmed.2024.100160","url":null,"abstract":"<div><p>Vital sign assessment is an examination that indicates changes in health. Direct contact during vital signs assessment can increase the risk of disease transmission. This research aimed to develop a contactless vital sign monitoring prototype that includes heart rate, respiratory rate, blood pressure, and oxygen saturation using a digital camera based on remote photoplethysmography with an adaptive region of interest. The adaptive region-of-interest method uses face detection and skin segmentation to generate red-green-blue signals, taking only the skin pixels of the patients while also minimising the effect of motion artefacts. The hybrid processing method combines several vital sign extraction methods to filter external irrelevant factors and produce heart rate, respiratory rate, blood pressure, and blood oxygen saturation values. In addition, the prototype was tested on 50 participants using standard vital sign assessment tools for comparison. The technical specification test of the prototype concluded that the optimal distance of this prototype was up to 2 m with a processing time of 2 s for every 1-s video. 
The vital signs results were presented using Bland-Altman, which showed that although the Bland-Altman plots revealed a substantial variance in the limits of agreement (±15–20 mmHg for blood pressure, ±15–17 bpm for heart rate, ±4–6 bpm for respiratory rate, and ±1–3 % for blood oxygen saturation), the mean differences for all vital signs were small (±0.7–5 mmHg for blood pressure, ±0.4–0.6 bpm for heart rate, ±0.5–0.7 bpm for respiratory rate, ±0.4–0.6 for blood oxygen saturation) and most data points were within the limits. While further clinical studies are needed to assess its reliability in monitoring specific medical conditions, the prototype has shown an acceptable agreement in assessing vital signs compared to the conventional methods, making it feasible for further development into a medical device.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"10 ","pages":"Article 100160"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521224000279/pdfft?md5=9c2a08467d4ad925fd1a09dfb6f59ae1&pid=1-s2.0-S2666521224000279-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141843363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
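The Bland-Altman quantities mentioned in the abstract above (mean difference and limits of agreement) can be illustrated with a short sketch. It assumes the conventional definition of the limits as bias ± 1.96 standard deviations of the paired differences, which the abstract does not spell out; the function name and the heart-rate values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(reference: list[float], candidate: list[float]) -> tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumes the conventional 95 % limits of agreement: bias +/- 1.96 * SD.
    """
    if len(reference) != len(candidate):
        raise ValueError("measurements must be paired")
    diffs = [c - r for r, c in zip(reference, candidate)]
    bias = mean(diffs)   # mean difference between the two methods
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical heart-rate readings in bpm (NOT data from the study).
reference_hr = [72, 80, 65, 90, 77, 84]   # contact-based reference device
camera_hr = [74, 78, 66, 95, 75, 85]      # contactless estimate

bias, lower, upper = bland_altman(reference_hr, camera_hr)
print(f"bias = {bias:.2f} bpm, limits of agreement = [{lower:.2f}, {upper:.2f}] bpm")
```

A small bias with wide limits, as the abstract reports, means the two methods agree on average while individual readings can still differ substantially.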
{"title":"Estimating the prevalence of diabetic retinopathy in electronic health records with massive missing labels","authors":"Ye Liang , Ru Wang , Yuchen Wang , Tieming Liu","doi":"10.1016/j.ibmed.2024.100154","DOIUrl":"https://doi.org/10.1016/j.ibmed.2024.100154","url":null,"abstract":"<div><h3>Objective</h3><p>The paper aims to address the problem of massive unlabeled patients in electronic health records (EHR) who potentially have undiagnosed diabetic retinopathy (DR). The goal is to estimate the actual DR prevalence in EHR with 96 % missing labels.</p></div><div><h3>Materials and methods</h3><p>The Cerner Health Facts data are used in the study, with 3749 labeled DR patients and 97,876 unlabeled diabetic patients. This extensive dataset spans the demographics of the United States over the past two decades. We implemented state-of-the-art positive-unlabeled learning methods, including ensemble-based support vector machine, ensemble-based random forest, and Bayesian finite mixture modeling.</p></div><div><h3>Results</h3><p>The estimated DR prevalence in the population represented by Cerner EHR is approximately 25 % and the classification techniques generally achieve an AUC of around 87 %. As a by-product, a predictive inference on the risk of DR based on a patient's personalized medical information is derived.</p></div><div><h3>Discussion</h3><p>Missing labels are a common issue for EHR data quality. Ignoring these missing labels can lead to biased results in the analyses of EHR data. The problem is especially severe in the context of DR. It is thus important to use machine learning or statistical tools to identify the unlabeled patients. 
The tool in this paper helps both data analysts and clinicians in their practices.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"10 ","pages":"Article 100154"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521224000218/pdfft?md5=0b269311073371904a3317a4df15d0e5&pid=1-s2.0-S2666521224000218-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141606989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}