Michael H Bernstein, Marly van Assen, Michael A Bruno, Elizabeth A Krupinski, Carlo De Cecco, Grayson L Baird
{"title":"Is a score enough? Pitfalls and solutions for AI severity scores.","authors":"Michael H Bernstein, Marly van Assen, Michael A Bruno, Elizabeth A Krupinski, Carlo De Cecco, Grayson L Baird","doi":"10.1186/s41747-025-00603-z","DOIUrl":"10.1186/s41747-025-00603-z","url":null,"abstract":"<p><p>Severity scores, which often refer to the likelihood or probability of a pathology, are commonly provided by artificial intelligence (AI) tools in radiology. However, little attention has been given to the use of these AI scores, and there is a lack of transparency into how they are generated. In this comment, we draw on key principles from psychological science and statistics to elucidate six human factors limitations of AI scores that undermine their utility: (1) variability across AI systems; (2) variability within AI systems; (3) variability between radiologists; (4) variability within radiologists; (5) unknown distribution of AI scores; and (6) perceptual challenges. We hypothesize that these limitations can be mitigated by providing the false discovery rate and false omission rate for each score as a threshold. We discuss how this hypothesis could be empirically tested. KEY POINTS: The radiologist-AI interaction has not been given sufficient attention. The utility of AI scores is limited by six key human factors limitations. 
We propose a hypothesis for how to mitigate these limitations by using false discovery rate and false omission rate.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"67"},"PeriodicalIF":3.7,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12259500/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144627359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
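The abstract above proposes reporting the false discovery rate (FDR) and false omission rate (FOR) with each AI score treated as a threshold. A minimal sketch of that calculation follows; the function name `fdr_for_at_threshold`, the toy scores, and the convention that a case is flagged when its score is at or above the threshold are illustrative assumptions, not details taken from the paper.

```python
def fdr_for_at_threshold(scores, labels, threshold):
    """Return (FDR, FOR) when cases with score >= threshold are flagged positive.

    scores: AI severity scores (hypothetical values in [0, 1]).
    labels: ground truth, 1 = pathology present, 0 = absent.
    Either rate is None when its denominator is zero.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label:
            tp += 1
        elif flagged:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    # FDR: share of flagged cases that are actually negative.
    fdr = fp / (tp + fp) if (tp + fp) else None
    # FOR: share of unflagged cases that are actually positive.
    false_omission = fn / (fn + tn) if (fn + tn) else None
    return fdr, false_omission


if __name__ == "__main__":
    # Toy data for illustration only, not study data.
    toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]
    for t in (0.25, 0.5, 0.75):
        print(t, fdr_for_at_threshold(toy_scores, toy_labels, t))
```

Both rates depend on disease prevalence in the data used to compute them, so a table of this kind would only transfer to a reading population with a similar case mix.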
Tobias Lindner, Adrian Konstantin Luyken, Chris Lappe, Oliver Stachs, Thoralf Niendorf, Matthias Lütgens, Stefan Polei, Brigitte Vollmar, Andreas Buettner, Sönke Langner, Marc-André Weber, Ebba Beller
{"title":"<sup>23</sup>Na MRI quantification of sodium content in porcine eyes after immersion in saltwater and freshwater en route to time in water estimation.","authors":"Tobias Lindner, Adrian Konstantin Luyken, Chris Lappe, Oliver Stachs, Thoralf Niendorf, Matthias Lütgens, Stefan Polei, Brigitte Vollmar, Andreas Buettner, Sönke Langner, Marc-André Weber, Ebba Beller","doi":"10.1186/s41747-025-00605-x","DOIUrl":"10.1186/s41747-025-00605-x","url":null,"abstract":"<p><strong>Background: </strong>Differentiation between saltwater and freshwater immersion as well as estimating the corpse's time in water can be challenging. We aimed to establish and examine the feasibility of a novel approach based on sodium magnetic resonance imaging (<sup>23</sup>Na MRI) of the eye to facilitate noninvasive sodium quantification.</p><p><strong>Methods: </strong>Enucleated porcine eyes were immersed in NaCl 0.9%, NaCl 3.0%, NaCl 5.85%, distilled water (DW) or lake water (LW) at different time intervals, followed by <sup>23</sup>Na 7-T MRI sodium quantification.</p><p><strong>Results: </strong>After 6 h of immersion, a significant difference in vitreous body (VB) sodium concentration was found for NaCl 5.85% versus DW or LW (p ≤ 0.019). After 24 and 48 h of immersion, a significant difference in VB sodium concentration was found for NaCl 5.85% versus DW, LW, NaCl 3.0% or NaCl 0.9%, as well as for NaCl 3.0% versus DW, LW or NaCl 0.9% (p ≤ 0.001). After 24 h of immersion, lens sodium concentration showed a significant difference for NaCl 5.85% versus DW, LW, NaCl 3.0% or NaCl 0.9% (p ≤ 0.009); after 48 h of immersion, for NaCl 5.85% versus DW, LW, NaCl 3.0% or NaCl 0.9% (p ≤ 0.001), as well as for NaCl 3.0% versus DW, LW or NaCl 0.9% (p ≤ 0.007). 
For VB, sodium concentration changes over immersion time, and exponential curves were fitted to the data.</p><p><strong>Conclusion: </strong>Using <sup>23</sup>Na MRI in ex vivo porcine eyes with different immersion times in various saltwater concentrations and freshwater equivalents allowed noninvasive quantification of VB and lens sodium concentrations.</p><p><strong>Relevance statement: </strong>Although not a substitute for autopsy, <sup>23</sup>Na MRI assessment of VB and lens sodium concentrations may provide biochemical support in suspected drowning, especially in cases where an internal examination of the body is not authorized or where objections to autopsy are upheld.</p><p><strong>Key points: </strong>Postmortem porcine eyes with different immersion times in saltwater and freshwater. Noninvasive quantification of vitreous body and lens sodium concentrations with <sup>23</sup>Na MRI. Exponential time course of vitreous body sodium concentration in saltwater and freshwater.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"66"},"PeriodicalIF":3.7,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12240899/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144601796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Damien Gasteau, Alexis Vrignaud, Arnaud Biallais, Fabrice Richard, Gilles Blancho, Julien Branchereau, Benoît Mesnard
{"title":"In vivo photoacoustic tomography of porcine abdominal organs using Fabry-Pérot sensing integrated platform.","authors":"Damien Gasteau, Alexis Vrignaud, Arnaud Biallais, Fabrice Richard, Gilles Blancho, Julien Branchereau, Benoît Mesnard","doi":"10.1186/s41747-025-00601-1","DOIUrl":"10.1186/s41747-025-00601-1","url":null,"abstract":"<p><strong>Objective: </strong>To evaluate in vivo a fully integrated photoacoustic tomography imaging system based on Fabry-Pérot ultrasound sensing method applied on porcine abdominal organs. This approach could be used by surgeons during intraoperative clinical procedures.</p><p><strong>Methods: </strong>The photoacoustic imaging system was fully integrated into a single structure, and the detection technology was based on a Fabry-Pérot interferometer. The detection probe connected to the imaging system was applied directly to the organs of a male \"large white\" Sus scrofa pig weighing 80 kg, either manually or using a stand, with or without a gel interface. All experiments were performed in compliance with EU Directive 2010/63/EU on animal experimentation (APAFiS #31507).</p><p><strong>Results: </strong>All intraperitoneal and retroperitoneal organs were evaluated using photoacoustic imaging. The evaluation of both hollow and solid organs was successfully conducted with consistent three-dimensional image quality. We demonstrate the system's ability to image blood vessels with diameters ranging from several millimeters down to less than 100 µm. Macroscopic evaluation of the organs using photoacoustic tomography imaging did not reveal any damage or burns caused by the excitation laser.</p><p><strong>Conclusion: </strong>To our knowledge, this is the first reported imaging session of abdominal organs in an in vivo porcine model, performed using a photoacoustic tomography system with Fabry-Pérot interferometer detection. 
We present a high-resolution photoacoustic tomography system that is closer to routine clinical translation, thanks to a fully integrated system.</p><p><strong>Relevance statement: </strong>Photoacoustic evaluation of organs using a fully integrated system could become a valuable tool for surgical teams for intraprocedural assessment of vascularization.</p><p><strong>Key points: </strong>Photoacoustic imaging visualizes blood vessels without contrast agents or ionizing radiation. Photoacoustic imaging systems detect blood vessels ranging from millimeters to 100 µm. Fully integrated photoacoustic imaging systems are autonomously operable by surgical teams.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"65"},"PeriodicalIF":3.7,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12241532/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144601797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sara Bizzozero, Tito Bassani, Luca Maria Sconfienza, Carmelo Messina, Matteo Bonato, Cecilia Inzaghi, Federica Marmondi, Paola Cinque, Giuseppe Banfi, Stefano Borghi
{"title":"Gender difference in cross-sectional area and fat infiltration of thigh muscles in the elderly population on MRI: an AI-based analysis.","authors":"Sara Bizzozero, Tito Bassani, Luca Maria Sconfienza, Carmelo Messina, Matteo Bonato, Cecilia Inzaghi, Federica Marmondi, Paola Cinque, Giuseppe Banfi, Stefano Borghi","doi":"10.1186/s41747-025-00606-w","DOIUrl":"10.1186/s41747-025-00606-w","url":null,"abstract":"<p><strong>Background: </strong>Aging alters musculoskeletal structure and function, affecting muscle mass, composition, and strength, increasing the risk of falls and loss of independence in older adults. This study assessed cross-sectional area (CSA) and fat infiltration (FI) of six thigh muscles through a validated deep learning model. Gender differences and correlations between fat, muscle parameters, and age were also analyzed.</p><p><strong>Methods: </strong>We retrospectively analyzed 141 participants (67 females, 74 males) aged 52-82 years. Participants underwent magnetic resonance imaging (MRI) scans of the right thigh and dual-energy x-ray absorptiometry to determine appendicular skeletal muscle mass index (ASMMI) and body fat percentage (FAT%). A deep learning-based application was developed to automate the segmentation of six thigh muscle groups.</p><p><strong>Results: </strong>Deep learning model accuracy was evaluated using the \"intersection over union\" (IoU) metric, with average IoU values across muscle groups ranging from 0.84 to 0.99. Mean CSA was 10,766.9 mm² (females 8,892.6 mm², males 12,463.9 mm², p < 0.001). The mean FI value was 14.92% (females 17.42%, males 12.62%, p < 0.001). Males showed larger CSA and lower FI in all thigh muscles compared to females. 
Positive correlations were identified in females between the FI of posterior thigh muscle groups (biceps femoris, semimembranosus, and semitendinosus) and age (r or ρ = 0.35-0.48; p ≤ 0.004), while no significant correlations were observed between CSA, ASMMI, or FAT% and age.</p><p><strong>Conclusion: </strong>Deep learning accurately quantifies muscle CSA and FI, reducing analysis time and human error. Aging affects muscle composition and distribution, and gender-specific assessments in older adults are needed.</p><p><strong>Relevance statement: </strong>Efficient deep learning-based MRI image segmentation to assess the composition of six thigh muscle groups in individuals over 50 years of age revealed gender differences in thigh muscle CSA and FI. These findings have potential clinical applications in assessing muscle quality, decline, and frailty.</p><p><strong>Key points: </strong>Deep learning model enhanced MRI segmentation, providing high assessment accuracy. Significant gender differences in cross-sectional area and fat infiltration across all thigh muscles were observed. In females, fat infiltration of the posterior thigh muscles was positively correlated with age.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"64"},"PeriodicalIF":3.7,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12234423/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144585073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
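The record above reports segmentation accuracy as "intersection over union" (IoU). As a rough sketch of how that metric is computed for one muscle group, assuming binary masks stored as nested lists of 0/1 (the function name `iou` and the tiny masks are hypothetical, and the study's own pipeline is not described in the abstract):

```python
def iou(mask_a, mask_b):
    """Intersection over union of two same-shaped binary masks (nested lists)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks agree perfectly; treating that case as 1.0 is a convention.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    # Toy 2x3 masks for illustration only.
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    print(iou(predicted, reference))
```

An IoU of 1.0 means the predicted and reference regions coincide exactly, and 0.0 means they do not overlap at all.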
Masia Fahim, Elke Hattingen, Alina Jurcoane, Jan R Schüre, Svenja Klinsing, Julia Koepsell, Kolja Jahnke, Michael W Ronellenfitsch, Ulrich Pilatus, Maria J G T Vehreschild, Ralf Deichmann, Christophe T Arendt
{"title":"Impact on the microstructure of deep gray matter in unvaccinated patients after moderate-to-severe COVID-19: insights from MRI T1 mapping.","authors":"Masia Fahim, Elke Hattingen, Alina Jurcoane, Jan R Schüre, Svenja Klinsing, Julia Koepsell, Kolja Jahnke, Michael W Ronellenfitsch, Ulrich Pilatus, Maria J G T Vehreschild, Ralf Deichmann, Christophe T Arendt","doi":"10.1186/s41747-025-00598-7","DOIUrl":"10.1186/s41747-025-00598-7","url":null,"abstract":"<p><strong>Background: </strong>To determine changes in quantitative T1 relaxation times (qT1) in deep gray matter in patients recovered from coronavirus disease 2019 (COVID-19).</p><p><strong>Methods: </strong>Unvaccinated COVID-19 participants ≥ 3 months after seropositivity and age- and sex-matched controls were examined using 3-T magnetic resonance imaging. Bilateral measures of thalamus, pallidum, putamen, caudate and accumbens nuclei, and hippocampus were extracted from qT1 maps after automated segmentation. Baseline characteristics and results of tests assessing neurological functions (standardized exam), ability to smell (4-Item Pocket Smell Test), depression (Beck Depression Inventory-II), sleepiness (Epworth Sleepiness Scale), sleep quality (Pittsburgh Sleep Quality Index), health-related quality of life (EQ-5D), and cognitive performance (Montreal Cognitive Assessment) were evaluated.</p><p><strong>Results: </strong>One hundred forty-five subjects (median age, 46 years; 73 females) were included (11/2020-12/2021): 69 recovered after COVID-19 and 76 controls (age, p = 0.532; sex, p = 0.799), without significant differences in qT1 values overall (all p-values > 0.050). 
Subgroup analysis of participants aged ≥ 40 (age, p = 0.675; sex, p = 0.447) revealed higher qT1 values in previously hospitalized COVID-19 subjects (23/69) compared to controls (47/76) in left and right caudate nuclei (p = 0.009; p = 0.027), left accumbens nucleus (p = 0.017), right putamen (p = 0.041), and right hippocampus (p = 0.020). No correlations were found with macroscopic imaging findings, pre-existing conditions, time since COVID-19 diagnosis, inpatient treatment duration, or test results.</p><p><strong>Conclusion: </strong>T1 mapping revealed microstructural changes in striatal and hippocampal regions of unvaccinated individuals aged ≥ 40 who recovered from moderate-to-severe COVID-19 during the pre-Omicron era.</p><p><strong>Relevance statement: </strong>This study elucidates brain involvement following severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, underscoring the need for further longitudinal analyses to assess the potential reversibility, stability or deterioration of these findings.</p><p><strong>Key points: </strong>We hypothesized altered T1 relaxation times in deep gray matter after COVID-19. Unvaccinated participants ≥ 40 years exhibited higher striatal, hippocampal qT1 after moderate-to-severe COVID-19. 
No qT1 correlations were found with hospitalization duration, pre-existing conditions, or neuro-(psycho)logical tests.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"63"},"PeriodicalIF":3.7,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12228859/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144567953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fabio Massimo Ulivieri, Carmelo Messina, Francesco Maria Vitale, Luca Rinaudo, Enzo Grossi
{"title":"Artificial intelligence for predicting the risk of bone fragility fractures in osteoporosis.","authors":"Fabio Massimo Ulivieri, Carmelo Messina, Francesco Maria Vitale, Luca Rinaudo, Enzo Grossi","doi":"10.1186/s41747-025-00572-3","DOIUrl":"10.1186/s41747-025-00572-3","url":null,"abstract":"<p><p>Osteoporosis is widespread with a high incidence rate, resulting in fragility fractures, which are a major contributor to mortality among the elderly. Artificial intelligence (AI), in particular artificial neural networks, appears to be useful in managing osteoporosis complexity, where bone mineral density usually decreases with aging, losing its pivotal role in decision-making regarding fracture prediction and treatment choice. Nevertheless, only some osteoporotic patients develop fragility fractures, and treatments are often not prescribed because of the high costs and poor patient adherence. AI can help clinicians to identify patients prone to fragility fractures who can benefit from preventive interventions. We describe herein the methodological issues underlying the potential advantages of introducing AI methods to support clinical decision-making in osteoporosis, being aware of challenges regarding data availability and quality, model interpretability, integration into clinical workflows, and validation of predictive accuracy. That no AI fracture risk prediction software is yet publicly available may be related to the fact that few high-quality datasets are available and that AI models, particularly deep learning approaches, often act as 'black boxes', making it difficult to understand how predictions are made. In addition, the effective implementation of predictive software has not reached sufficient integration with existing systems. RELEVANCE STATEMENT: With aging, bone mineral density may lose its pivotal role in osteoporosis decision-making regarding fracture prediction and treatment choice. 
In this scenario, AI, particularly artificial neural networks (ANNs), can be useful in supporting the clinical management of patients affected by osteoporosis. KEY POINTS: Osteoporosis is a complex disease with many interlinked clinical and radiological variables. Bone mineral density and other known indices do not allow optimal decision-making in patients affected by osteoporosis. ANN analysis can better discriminate osteoporotic patients particularly prone to fragility fractures and can predict future fractures.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"62"},"PeriodicalIF":3.7,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12187619/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144486295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Oona Rainio, Heidi Huhtanen, Jari-Pekka Vierula, Janne Nurminen, Jaakko Heikkinen, Mikko Nyman, Riku Klén, Jussi Hirvonen
{"title":"Deep learning detects retropharyngeal edema on MRI in patients with acute neck infections.","authors":"Oona Rainio, Heidi Huhtanen, Jari-Pekka Vierula, Janne Nurminen, Jaakko Heikkinen, Mikko Nyman, Riku Klén, Jussi Hirvonen","doi":"10.1186/s41747-025-00599-6","DOIUrl":"10.1186/s41747-025-00599-6","url":null,"abstract":"<p><strong>Background: </strong>In acute neck infections, magnetic resonance imaging (MRI) shows retropharyngeal edema (RPE), which is a prognostic imaging biomarker for a severe course of illness. This study aimed to develop a deep learning-based algorithm for the automated detection of RPE.</p><p><strong>Methods: </strong>We developed a deep neural network consisting of two parts using axial T2-weighted water-only Dixon MRI images from 479 patients with acute neck infections annotated by radiologists at both slice and patient levels. First, a convolutional neural network (CNN) classified individual slices; second, an algorithm classified patients based on a stack of slices. Model performance was compared with the radiologists' assessment as a reference standard. Accuracy, sensitivity, specificity, and area under receiver operating characteristic curve (AUROC) were calculated. The proposed CNN was compared with InceptionV3, and the patient-level classification algorithm was compared with traditional machine learning models.</p><p><strong>Results: </strong>Of the 479 patients, 244 (51%) were positive and 235 (49%) negative for RPE. Our model achieved accuracy, sensitivity, specificity, and AUROC of 94.6%, 83.3%, 96.2%, and 94.1% at the slice level, and 87.4%, 86.5%, 88.2%, and 94.8% at the patient level, respectively. The proposed CNN was faster than InceptionV3 but equally accurate. 
Our patient classification algorithm outperformed traditional machine learning models.</p><p><strong>Conclusion: </strong>A deep learning model, based on weakly annotated data and computationally manageable training, achieved high accuracy for automatically detecting RPE on MRI in patients with acute neck infections.</p><p><strong>Relevance statement: </strong>Our automated method for detecting relevant MRI findings was efficiently trained and might be easily deployed in practice to study clinical applicability. This approach might improve early detection of patients at high risk for a severe course of acute neck infections.</p><p><strong>Key points: </strong>Deep learning automatically detected retropharyngeal edema on MRI in acute neck infections. Areas under the receiver operating characteristic curve were 94.1% at the slice level and 94.8% at the patient level. The proposed convolutional neural network was lightweight and required only weakly annotated data.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"60"},"PeriodicalIF":3.7,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12179047/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144327120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jonas Wihl, Enrike Rosenkranz, Severin Schramm, Cornelius Berberich, Michael Griessmair, Piotr Woźnicki, Francisco Pinto, Sebastian Ziegelmayer, Lisa C Adams, Keno K Bressem, Jan S Kirschke, Claus Zimmer, Benedikt Wiestler, Dennis Hedderich, Su Hwan Kim
{"title":"Data extraction from free-text stroke CT reports using GPT-4o and Llama-3.3-70B: the impact of annotation guidelines.","authors":"Jonas Wihl, Enrike Rosenkranz, Severin Schramm, Cornelius Berberich, Michael Griessmair, Piotr Woźnicki, Francisco Pinto, Sebastian Ziegelmayer, Lisa C Adams, Keno K Bressem, Jan S Kirschke, Claus Zimmer, Benedikt Wiestler, Dennis Hedderich, Su Hwan Kim","doi":"10.1186/s41747-025-00600-2","DOIUrl":"10.1186/s41747-025-00600-2","url":null,"abstract":"<p><strong>Background: </strong>To evaluate the impact of an annotation guideline on the performance of large language models (LLMs) in extracting data from stroke computed tomography (CT) reports.</p><p><strong>Methods: </strong>The performance of GPT-4o and Llama-3.3-70B in extracting ten imaging findings from stroke CT reports was assessed in two datasets from a single academic stroke center. Dataset A (n = 200) was a stratified cohort including various pathological findings, whereas dataset B (n = 100) was a consecutive cohort. Initially, an annotation guideline providing clear data extraction instructions was designed based on a review of cases with inter-annotator disagreements in dataset A. For each LLM, data extraction was performed under two conditions: with the annotation guideline included in the prompt and without it.</p><p><strong>Results: </strong>GPT-4o consistently demonstrated superior performance over Llama-3.3-70B under identical conditions, with micro-averaged precision ranging from 0.83 to 0.95 for GPT-4o and from 0.65 to 0.86 for Llama-3.3-70B. Across both models and both datasets, incorporating the annotation guideline into the LLM input resulted in higher precision rates, while recall rates largely remained stable. In dataset B, the precision of GPT-4o and Llama-3.3-70B improved from 0.83 to 0.95 and from 0.87 to 0.94, respectively. 
Overall classification performance with and without the annotation guideline was significantly different in five out of six conditions.</p><p><strong>Conclusion: </strong>GPT-4o and Llama-3.3-70B show promising performance in extracting imaging findings from stroke CT reports, although GPT-4o steadily outperformed Llama-3.3-70B. We also provide evidence that well-defined annotation guidelines can enhance LLM data extraction accuracy.</p><p><strong>Relevance statement: </strong>Annotation guidelines can improve the accuracy of LLMs in extracting findings from radiological reports, potentially optimizing data extraction for specific downstream applications.</p><p><strong>Key points: </strong>LLMs have utility in data extraction from radiology reports, but the role of annotation guidelines remains underexplored. Data extraction accuracy from stroke CT reports by GPT-4o and Llama-3.3-70B improved when well-defined annotation guidelines were incorporated into the model prompt. Well-defined annotation guidelines can improve the accuracy of LLMs in extracting imaging findings from radiological reports.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"61"},"PeriodicalIF":3.7,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12179022/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144327119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
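The record above reports micro-averaged precision and recall across ten extracted imaging findings. The sketch below shows what micro-averaging means in this setting: true positives, false positives, and false negatives are pooled over all findings before the ratios are taken. The function name, the finding names, and the counts are hypothetical and are not taken from the study.

```python
def micro_precision_recall(counts_per_finding):
    """Pool per-finding (tp, fp, fn) counts, then compute precision and recall.

    counts_per_finding: dict mapping a finding name to a (tp, fp, fn) tuple.
    Returns (precision, recall); either is None if its denominator is zero.
    """
    tp = sum(c[0] for c in counts_per_finding.values())
    fp = sum(c[1] for c in counts_per_finding.values())
    fn = sum(c[2] for c in counts_per_finding.values())
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts for two findings, for illustration only.
    toy_counts = {
        "finding_a": (8, 2, 1),
        "finding_b": (2, 0, 3),
    }
    print(micro_precision_recall(toy_counts))
```

Because the counts are pooled, frequent findings weigh more heavily than rare ones; a macro average would instead compute each finding's ratio first and then average them.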
Lukas Zbinden, Samuel Erb, Damiano Catucci, Lars Doorenbos, Leona Hulbert, Annalisa Berzigotti, Michael Brönimann, Lukas Ebner, Andreas Christe, Verena Carola Obmann, Raphael Sznitman, Adrian Thomas Huber
{"title":"Automated quantification of T1 and T2 relaxation times in liver mpMRI using deep learning: a sequence-adaptive approach.","authors":"Lukas Zbinden, Samuel Erb, Damiano Catucci, Lars Doorenbos, Leona Hulbert, Annalisa Berzigotti, Michael Brönimann, Lukas Ebner, Andreas Christe, Verena Carola Obmann, Raphael Sznitman, Adrian Thomas Huber","doi":"10.1186/s41747-025-00596-9","DOIUrl":"10.1186/s41747-025-00596-9","url":null,"abstract":"<p><strong>Objectives: </strong>To evaluate a deep learning sequence-adaptive liver multiparametric MRI (mpMRI) assessment with validation in different populations using total and segmental T1 and T2 relaxation time maps.</p><p><strong>Methods: </strong>A neural network was trained to label liver segmental parenchyma and its vessels on noncontrast T1-weighted gradient-echo Dixon in-phase acquisitions on 200 liver mpMRI examinations. Then, 120 unseen liver mpMRI examinations of patients with primary sclerosing cholangitis or healthy controls were assessed by coregistering the labels to noncontrast and contrast-enhanced T1 and T2 relaxation time maps for optimization and internal testing. The algorithm was externally tested in a segmental and total liver analysis of 65 previously unseen patients with biopsy-proven liver fibrosis and 25 healthy volunteers. Measured relaxation times were compared to manual measurements using intraclass correlation coefficient (ICC) and Wilcoxon test.</p><p><strong>Results: </strong>Agreement between manual and deep learning-generated segmental areas on different T1 and T2 maps was excellent for segmental (ICC = 0.95 ± 0.1; p < 0.001) and total liver assessment (0.97 ± 0.02, p < 0.001). 
The resulting median of the differences between automated and manual measurements among all testing populations and liver segments was 1.8 ms for noncontrast T1 (median 835 versus 842 ms), 2.0 ms for contrast-enhanced T1 (median 518 versus 519 ms), and 0.3 ms for T2 (median 37 versus 37 ms).</p><p><strong>Conclusion: </strong>Automated quantification of liver mpMRI is highly effective across different patient populations, offering excellent reliability for total and segmental T1 and T2 maps. Its scalable, sequence-adaptive design could foster comprehensive clinical decision-making.</p><p><strong>Relevance statement: </strong>The proposed automated, sequence-adaptive algorithm for total and segmental analysis of liver mpMRI may be co-registered to any combination of parametric sequences, enabling comprehensive quantitative analysis of liver mpMRI without sequence-specific training.</p><p><strong>Key points: </strong>A deep learning-based algorithm automatically quantified segmental T1 and T2 relaxation times in liver mpMRI. The two-step approach of segmentation and co-registration allowed assessment of arbitrary sequences. The algorithm demonstrated high reliability with manual reader quantification. No additional sequence-specific training is required to assess other parametric sequences. 
The DL algorithm has the potential to enhance individual liver phenotyping.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"58"},"PeriodicalIF":3.7,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12167186/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144295000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maurice M Heimer, Amra Cimic, Sandra Kloiber-Langhorst, Melissa J Antons, Jennifer Stueckl, Heidrun Hirner-Eppeneder, Wolfgang G Kunz, Olaf Dietrich, Jens Ricke, Felix L Herr, Clemens C Cyran
{"title":"Quantitative response assessment of combined immunotherapy in a murine melanoma model using multiparametric MRI.","authors":"Maurice M Heimer, Amra Cimic, Sandra Kloiber-Langhorst, Melissa J Antons, Jennifer Stueckl, Heidrun Hirner-Eppeneder, Wolfgang G Kunz, Olaf Dietrich, Jens Ricke, Felix L Herr, Clemens C Cyran","doi":"10.1186/s41747-025-00597-8","DOIUrl":"10.1186/s41747-025-00597-8","url":null,"abstract":"<p><strong>Background: </strong>We assessed immunotherapy response in a murine melanoma model using multiparametric magnetic resonance imaging (mpMRI) features with ex vivo immunohistochemical validation.</p><p><strong>Methods: </strong>Murine melanoma cells (B16-F10) were inoculated into the subcutaneous flank of n = 28 C57BL/6 mice (n = 14 therapy; n = 14 control). Baseline mpMRI was acquired on day 7 at 3 T. The immunotherapy group received three intraperitoneal injections of anti-PD-L1 and anti-CTLA-4 antibodies on days 7, 9, and 11 after inoculation. Controls received a volume equivalent placebo. Follow-up mpMRI was performed on day 12. We assessed tumor volume, diffusion-weighted imaging parameters, including the apparent diffusion coefficient (ADC), and dynamic-contrast-enhanced metrics, including plasma volume and plasma flow. Tumor-infiltrating lymphocytes (TIL; CD8+), cell proliferation (Ki-67), apoptosis (terminal deoxynucleotidyl transferase deoxyuridine triphosphate nick-end labeling, TUNEL), and microvascular density (CD31+) were assessed in a validation cohort of n = 24 animals for time-matched ex vivo validation.</p><p><strong>Results: </strong>An increase in tumor volume was observed in both groups (p ≤ 0.004) without difference at follow-up (p = 0.630). A lower ADC value was observed in the immunotherapy group at follow-up (p = 0.001). 
Immunohistochemistry revealed higher TUNEL values (p < 0.001) and CD8+ TILs (p = 0.048) following immunotherapy, as well as lower tumor cell Ki-67 values (p < 0.001) and microvascular density/CD31+ (p < 0.001).</p><p><strong>Conclusion: </strong>Lower tumor ADC, paired with higher intratumoral expression of CD8+ TIL, was observed five days after immunotherapy, suggestive of early immunological response. Ex vivo immunohistochemistry confirmed the antitumoral efficacy of immunotherapy.</p><p><strong>Relevance statement: </strong>Compared to tumor size, diffusion-weighted MRI demonstrated potential for early response assessment to immunotherapy in a murine melanoma model, which could reflect changes in the tumor microenvironment and immune cell infiltration.</p><p><strong>Key points: </strong>No difference in tumor volume was observed between groups before and after therapy. Lower ADC values paired with increased CD8+ TILs were observed following immunotherapy. Ex vivo immunohistochemistry confirmed antitumoral efficacy of anti-PD-L1 and anti-CTLA-4 immunotherapy.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":"9 1","pages":"59"},"PeriodicalIF":3.7,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12167185/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144295001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}