Multidimensional Directionality-Enhanced Segmentation via large vision model
Xingru Huang, Changpeng Yue, Yihao Guo, Jian Huang, Zhengyao Jiang, Mingkuan Wang, Zhaoyang Xu, Guangyuan Zhang, Jin Liu, Tianyun Zhang, Zhiwen Zheng, Xiaoshuai Zhang, Hong He, Shaowei Jiang, Yaoqi Sun
Medical Image Analysis, vol. 101, Article 103395 (published 2024-11-25). DOI: https://doi.org/10.1016/j.media.2024.103395

Abstract: Optical Coherence Tomography (OCT) facilitates a comprehensive examination of macular edema and associated lesions. Manual delineation of retinal fluid is labor-intensive and error-prone, necessitating an automated diagnostic and therapeutic planning mechanism. Conventional supervised learning models are hindered by dataset limitations, while Transformer-based large vision models exhibit challenges in medical image segmentation, particularly in detecting small, subtle lesions in OCT images. This paper introduces the Multidimensional Directionality-Enhanced Retinal Fluid Segmentation framework (MD-DERFS), which reduces the limitations inherent in conventional supervised models by adapting a transformer-based large vision model for macular edema segmentation. MD-DERFS introduces a Multi-Dimensional Feature Re-Encoder Unit (MFU) that augments the model's proficiency in recognizing specific textures and pathological features through directional prior extraction and an Edema Texture Mapping Unit (ETMU); a Cross-scale Directional Insight Network (CDIN) furnishes a holistic perspective spanning local to global details, mitigating the large vision model's deficiencies in capturing localized feature information. Additionally, the framework is augmented by a Harmonic Minutiae Segmentation Equilibrium loss (L_HMSE) that addresses the challenges of data imbalance and annotation scarcity in macular edema datasets. Empirical validation on the MacuScan-8k dataset shows that MD-DERFS surpasses existing segmentation methodologies, demonstrating its efficacy in adapting large vision models for boundary-sensitive medical imaging tasks. The code is publicly available at https://github.com/IMOP-lab/MD-DERFS-Pytorch.git.
CLMS: Bridging domain gaps in medical imaging segmentation with source-free continual learning for robust knowledge transfer and adaptation
Weilu Li, Yun Zhang, Hao Zhou, Wenhan Yang, Zhi Xie, Yao He
Medical Image Analysis, vol. 100, Article 103404 (published 2024-11-24). DOI: https://doi.org/10.1016/j.media.2024.103404

Abstract: Deep learning shows promise for medical image segmentation but suffers performance declines when applied to diverse healthcare sites due to data discrepancies among the different sites. Translating deep learning models to new clinical environments is challenging, especially when the original source data used for training is unavailable due to privacy restrictions. Source-free domain adaptation (SFDA) aims to adapt models to new unlabeled target domains without requiring access to the original source data. However, existing SFDA methods face challenges such as error propagation, misalignment of visual and structural features, and an inability to preserve source knowledge. This paper introduces Continual Learning Multi-Scale domain adaptation (CLMS), an end-to-end SFDA framework integrating multi-scale reconstruction, continual learning, and style alignment to bridge domain gaps across medical sites using only unlabeled target data or publicly available data. Compared to current state-of-the-art methods, CLMS consistently and significantly achieved top performance across tasks, including prostate MRI segmentation (Dice improvement of 10.87%), colonoscopy polyp segmentation (Dice improvement of 17.73%), and plus disease classification from retinal images (AUC improvement of 11.19%). Crucially, CLMS preserved source knowledge for all tasks, avoiding catastrophic forgetting. CLMS offers a promising solution for translating deep learning models to new clinical imaging domains, toward safe, reliable deployment across diverse healthcare settings.
Highly accelerated MRI via implicit neural representation guided posterior sampling of diffusion models
Jiayue Chu, Chenhe Du, Xiyue Lin, Xiaoqun Zhang, Lihui Wang, Yuyao Zhang, Hongjiang Wei
Medical Image Analysis, vol. 100, Article 103398 (published 2024-11-23). DOI: https://doi.org/10.1016/j.media.2024.103398

Abstract: Reconstructing high-fidelity magnetic resonance (MR) images from under-sampled k-space is a commonly used strategy to reduce scan time. Posterior sampling of diffusion models based on the real measurement data holds significant promise of improved reconstruction accuracy. However, traditional posterior sampling methods often lack effective data-consistency guidance, leading to inaccurate and unstable reconstructions. Implicit neural representation (INR) has emerged as a powerful paradigm for solving inverse problems by modeling a signal's attributes as a continuous function of spatial coordinates. In this study, we present a novel posterior sampler for diffusion models using INR, named DiffINR. The INR-based component incorporates both the diffusion prior distribution and the MRI physical model to ensure high data fidelity. DiffINR demonstrates superior performance on in-distribution datasets with remarkable accuracy, even under high acceleration factors (up to R = 12 in single-channel reconstruction). Furthermore, DiffINR exhibits excellent generalizability across various tissue contrasts and anatomical structures with low uncertainty. Overall, DiffINR significantly improves MRI reconstruction in terms of accuracy, generalizability, and stability, paving the way for further accelerating MRI acquisition. Notably, the proposed framework can also serve as a general framework for solving inverse problems in other medical imaging tasks.
Noise-aware dynamic image denoising and positron range correction for Rubidium-82 cardiac PET imaging via self-supervision
Huidong Xie, Liang Guo, Alexandre Velo, Zhao Liu, Qiong Liu, Xueqi Guo, Bo Zhou, Xiongchao Chen, Yu-Jung Tsai, Tianshun Miao, Menghua Xia, Yi-Hwa Liu, Ian S. Armstrong, Ge Wang, Richard E. Carson, Albert J. Sinusas, Chi Liu
Medical Image Analysis, vol. 100, Article 103391 (published 2024-11-20). DOI: https://doi.org/10.1016/j.media.2024.103391

Abstract: Rubidium-82 (82Rb) is a radioactive isotope widely used for cardiac PET imaging. Despite the numerous benefits of 82Rb, several factors limit its image quality and quantitative accuracy. First, the short half-life of 82Rb results in noisy dynamic frames; the low signal-to-noise ratio leads to inaccurate and biased image quantification, and noisy dynamic frames in turn produce highly noisy parametric images. Noise levels also vary substantially across dynamic frames due to radiotracer decay and the short half-life. Existing denoising methods are not applicable to this task because paired training inputs/labels are unavailable and such methods cannot generalize across varying noise levels. Second, 82Rb emits high-energy positrons: compared with other tracers such as 18F, 82Rb positrons travel a longer distance before annihilation, which negatively affects image spatial resolution. The goal of this study is therefore to propose a self-supervised method for simultaneous (1) noise-aware dynamic image denoising and (2) positron range correction for 82Rb cardiac PET imaging. Tested on a series of PET scans from a cohort of normal volunteers, the proposed method produced images with superior visual quality. To demonstrate the improvement in image quantification, we compared image-derived input functions (IDIFs) with arterial input functions (AIFs) from continuous arterial blood samples. The IDIF derived from the proposed method led to lower AUC differences, decreasing from 11.09% to 7.58% on average, compared to the original dynamic frames. The proposed method also improved the quantification of myocardial blood flow (MBF), as validated against 15O-water scans, with mean MBF differences decreased from 0.43 to 0.09 compared to the original dynamic frames. We also conducted a generalizability experiment on 37 patient scans obtained from a different country using a different scanner; the presented method enhanced defect contrast and resulted in lower regional MBF in areas with perfusion defects. Lastly, comparisons with other related methods are included to show the effectiveness of the proposed method.
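Since the paper's exact self-supervision scheme is not given in the abstract, the sketch below shows one plausible reading: neighboring dynamic frames supervise each other (Noise2Noise-style, which needs no clean labels), and a scalar noise level, e.g. derived from frame duration and tracer decay, conditions the network so one model handles varying noise. Positron range correction is omitted; the architecture and conditioning are assumptions.

```python
# Hedged sketch of noise-aware, self-supervised denoising (not the authors' model).
import torch
import torch.nn as nn

class NoiseAwareDenoiser(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        # Second input channel carries a constant noise-level map (the conditioning).
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )
    def forward(self, frame, noise_level):
        level = torch.full_like(frame, float(noise_level))
        return self.net(torch.cat([frame, level], dim=1))

model = NoiseAwareDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
frame_t  = torch.randn(4, 1, 64, 64)     # noisy dynamic frame at time t
frame_t1 = torch.randn(4, 1, 64, 64)     # independently noisy neighboring frame
noise_level = 0.8                        # assumed scalar from frame duration / decay
pred = model(frame_t, noise_level)
loss = ((pred - frame_t1) ** 2).mean()   # Noise2Noise target: another noisy realization
opt.zero_grad(); loss.backward(); opt.step()
```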
DDKG: A Dual Domain Knowledge Guidance strategy for localization and diagnosis of non-displaced femoral neck fractures
Jing Yang, Lianxin Wang, Chen Lin, Jiacheng Wang, Liansheng Wang
Medical Image Analysis, vol. 100, Article 103393 (published 2024-11-19). DOI: https://doi.org/10.1016/j.media.2024.103393

Abstract: X-ray is the primary tool for diagnosing fractures, crucial for determining their type, location, and severity. However, non-displaced femoral neck fractures (ND-FNF) can pose challenges in identification due to subtle cracks and complex anatomical structures. Most deep learning-based methods for diagnosing ND-FNF rely on cropped images, necessitating manual annotation of the hip location, which increases annotation costs. To address this challenge, we propose Dual Domain Knowledge Guidance (DDKG), which harnesses spatial and semantic domain knowledge to guide the model in acquiring robust representations of ND-FNF across the whole X-ray image. Specifically, DDKG comprises two key modules: the Spatial Aware Module (SAM) and the Semantic Coordination Module (SCM). SAM employs limited positional supervision to guide the model to focus on the hip joint region and reduce background interference. SCM integrates information from radiological reports, utilizes prior knowledge from large language models to extract critical information related to ND-FNF, and guides the model to learn relevant visual representations. During inference, the model requires only the whole X-ray image for accurate diagnosis, without additional information. The model was validated on datasets from four different centers, showing consistent accuracy and robustness. Codes and models are available at https://github.com/Yjing07/DDKG.
TSdetector: Temporal–Spatial self-correction collaborative learning for colonoscopy video detection
Kai-Ni Wang, Haolin Wang, Guang-Quan Zhou, Yangang Wang, Ling Yang, Yang Chen, Shuo Li
Medical Image Analysis, vol. 100, Article 103384 (published 2024-11-19). DOI: https://doi.org/10.1016/j.media.2024.103384

Abstract: CNN-based object detection models that strike a balance between performance and speed have gradually been adopted for polyp detection. Nevertheless, accurately locating polyps within complex colonoscopy video scenes remains challenging because existing methods ignore two key issues: intra-sequence distribution heterogeneity and the precision-confidence discrepancy. To address these challenges, we propose a novel Temporal–Spatial self-correction detector (TSdetector), which integrates temporal-level consistency learning and spatial-level reliability learning to detect objects continuously. Technically, we first propose a global temporal-aware convolution that assembles preceding information to dynamically guide the current convolution kernel toward global features across sequences. In addition, we design a hierarchical queue integration mechanism that combines multi-temporal features through progressive accumulation, fully leveraging contextual consistency information while retaining long-sequence-dependency features. Meanwhile, at the spatial level, we advance a position-aware clustering to explore the spatial relationships among candidate boxes and recalibrate prediction confidence adaptively, thus eliminating redundant bounding boxes efficiently. Experimental results on three publicly available polyp video datasets show that TSdetector achieves the highest polyp detection rate and outperforms other state-of-the-art methods. The code is available at https://github.com/soleilssss/TSdetector.
{"title":"HiFi-Syn: Hierarchical granularity discrimination for high-fidelity synthesis of MR images with structure preservation","authors":"Ziqi Yu , Botao Zhao , Shengjie Zhang , Xiang Chen , Fuhua Yan , Jianfeng Feng , Tingying Peng , Xiao-Yong Zhang","doi":"10.1016/j.media.2024.103390","DOIUrl":"10.1016/j.media.2024.103390","url":null,"abstract":"<div><div>Synthesizing medical images while preserving their structural information is crucial in medical research. In such scenarios, the preservation of anatomical content becomes especially important. Although recent advances have been made by incorporating instance-level information to guide translation, these methods overlook the spatial coherence of structural-level representation and the anatomical invariance of content during translation. To address these issues, we introduce hierarchical granularity discrimination, which exploits various levels of semantic information present in medical images. Our strategy utilizes three levels of discrimination granularity: pixel-level discrimination using a Brain Memory Bank, structure-level discrimination on each brain structure with a re-weighting strategy to focus on hard samples, and global-level discrimination to ensure anatomical consistency during translation. The image translation performance of our strategy has been evaluated on three independent datasets (UK Biobank, IXI, and BraTS 2018), and it has outperformed state-of-the-art algorithms. Particularly, our model excels not only in synthesizing normal structures but also in handling abnormal (pathological) structures, such as brain tumors, despite the variations in contrast observed across different imaging modalities due to their pathological characteristics. The diagnostic value of synthesized MR images containing brain tumors has been evaluated by radiologists. This indicates that our model may offer an alternative solution in scenarios where specific MR modalities of patients are unavailable. Extensive experiments further demonstrate the versatility of our method, providing unique insights into medical image translation.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103390"},"PeriodicalIF":10.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142721144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation
Qiyuan Wang, Shang Zhao, Zikang Xu, S. Kevin Zhou
Medical Image Analysis, vol. 99, Article 103387 (published 2024-11-12). DOI: https://doi.org/10.1016/j.media.2024.103387

Abstract: Surgical instrument segmentation is instrumental to minimally invasive surgeries and related applications. Most previous methods formulate this task as single-frame-based instance segmentation, ignoring the natural temporal and stereo attributes of surgical video; as a result, they are less robust to appearance variation caused by temporal motion and view change. In this work, we propose a novel LACOSTE model that exploits Location-Agnostic COntexts in Stereo and TEmporal images for improved surgical instrument segmentation. With a query-based segmentation model at its core, we design three performance-enhancing modules. First, a disparity-guided feature propagation module explicitly enhances depth-aware features; to generalize even to monocular-only video, a pseudo-stereo scheme generates complementary right images. Second, a stereo-temporal set classifier aggregates stereo-temporal contexts in a universal way to make a consolidated prediction and mitigate transient failures. Finally, a location-agnostic classifier decouples location bias from mask prediction and enhances feature semantics. We extensively validate our approach on three public surgical video datasets, including two benchmarks from the EndoVis Challenges and the real radical prostatectomy dataset GraSP. Experimental results demonstrate the promising performance of our method, which consistently achieves comparable or favorable results relative to previous state-of-the-art approaches.
A survey on deep learning in medical image registration: New technologies, uncertainty, evaluation metrics, and beyond
Junyu Chen, Yihao Liu, Shuwen Wei, Zhangxing Bian, Shalini Subramanian, Aaron Carass, Jerry L. Prince, Yong Du
Medical Image Analysis, vol. 100, Article 103385 (published 2024-11-10). DOI: https://doi.org/10.1016/j.media.2024.103385

Abstract: Deep learning technologies have dramatically reshaped the field of medical image registration over the past decade. The initial developments, such as regression-based and U-Net-based networks, established the foundation for deep learning in image registration. Subsequent progress has been made in various aspects of deep learning-based registration, including similarity measures, deformation regularizations, network architectures, and uncertainty estimation. These advancements have not only enriched the field of image registration but have also facilitated its application in a wide range of tasks, including atlas construction, multi-atlas segmentation, motion estimation, and 2D–3D registration. In this paper, we present a comprehensive overview of the most recent advancements in deep learning-based image registration. We begin with a concise introduction to the core concepts of deep learning-based image registration. Then, we delve into innovative network architectures, loss functions specific to registration, and methods for estimating registration uncertainty. Additionally, this paper explores appropriate evaluation metrics for assessing the performance of deep learning models in registration tasks. Finally, we highlight the practical applications of these novel techniques in medical imaging and discuss the future prospects of deep learning-based image registration.
Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: A data-driven approach for improved classification
Ricardo Bigolin Lanfredi, Pritam Mukherjee, Ronald M. Summers
Medical Image Analysis, vol. 99, Article 103383 (published 2024-11-10). DOI: https://doi.org/10.1016/j.media.2024.103383

Abstract: In chest X-ray (CXR) image analysis, rule-based systems are usually employed to extract labels from reports for dataset releases, but there is still room for improvement in label quality. These labelers typically output only presence labels, sometimes with binary uncertainty indicators, which limits their usefulness. Supervised deep learning models have also been developed for report labeling but, like rule-based systems, lack adaptability. In this work, we present MAPLEZ (Medical report Annotations with Privacy-preserving Large language model using Expeditious Zero shot answers), a novel approach leveraging a locally executable Large Language Model (LLM) to extract and enhance finding labels from CXR reports. MAPLEZ extracts not only binary labels indicating the presence or absence of a finding but also its location, severity, and the radiologist's uncertainty about it. Over eight abnormalities from five test sets, we show that our method can extract these annotations with an increase of 3.6 percentage points (pp) in macro F1 score for categorical presence annotations and an increase of more than 20 pp in F1 score for location annotations over competing labelers. Additionally, using the combination of improved annotations and multi-type annotations for classification supervision on a dataset of limited-resolution CXRs, we demonstrate substantial advancements in proof-of-concept classification quality, with an increase of 1.1 pp in AUROC over models trained with annotations from the best alternative approach. We share code and annotations.