Dependence of observer task on conclusions drawn from in silico trials evaluating the performance of full-field digital mammography and digital breast tomosynthesis.
Impact factor: 1.9 (JCR Q3, Radiology, Nuclear Medicine & Medical Imaging)
Authors: Dan Li, Andrey Makeev, Stephen J Glick
Journal: Journal of Medical Imaging, vol. 12, Suppl 1, S13014 (publication date 2025-01-01; Epub 2025-05-19)
DOI: 10.1117/1.JMI.12.S1.S13014 (https://doi.org/10.1117/1.JMI.12.S1.S13014)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12087637/pdf/
Citation count: 0
Abstract
Purpose: We aim to refine the task-based evaluation of full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT) through in silico trials (ISTs). Previous ISTs have mostly used lesion detection as the task for performance evaluation. This differs from clinical practice, where the radiologist normally both detects whether a suspicious lesion is present and rates how likely the lesion is to be malignant. We hypothesize that the conclusions drawn from an IST can differ depending on the task that is defined.
Approach: The shape of the masses was employed as a surrogate indicator for malignancy, with spiculated masses representing malignant lesions and lobular masses representing benign lesions. A convolutional neural network (CNN) model observer was then trained to differentiate between spiculated and nonspiculated masses using Monte Carlo-simulated breast images. This approach leverages prior research demonstrating that CNN-based frameworks can approximate the performance of an ideal observer. We systematically evaluated the effects of varying dose levels, detector pixel size, and projection angular range on the CNN model observer's performance in both detection and classification tasks, assessing the performance of both FFDM and DBT systems.
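To make the two observer tasks concrete, the following is a minimal sketch of how a model observer's output scores could be summarized for each task. The abstract does not state the figure of merit, so the area under the ROC curve (AUC) is an assumption here, and the function name `auc` and all score values are hypothetical illustrations, not data from the paper.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Mann-Whitney estimate of the area under the ROC curve.

    Counts the fraction of (positive, negative) score pairs in which the
    positive case scores higher; ties count as half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up observer outputs for illustration only (not from the paper).
# Detection task: mass-present images vs. mass-absent images.
detection_auc = auc([0.9, 0.8, 0.7, 0.6], [0.4, 0.3, 0.2, 0.1])
# Classification task: spiculated (malignant surrogate) vs. lobular (benign surrogate).
classification_auc = auc([0.9, 0.6, 0.5, 0.3], [0.7, 0.4, 0.2, 0.1])

print(detection_auc, classification_auc)  # 1.0 0.75
```

In this toy example the same observer separates the detection classes perfectly but only partly separates the classification classes, which is the kind of task dependence the study is designed to probe.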
Results: Our findings demonstrate significant variations in conclusions drawn from IST models depending on whether the task is lesion detection or classification. Specifically, we observed that varying average glandular dose levels from 2.0 to 0.5 mGy had little effect on the detection of masses, whereas a small but significant decrease in performance with reduced dose was observed with the classification task across FFDM and DBT. Similarly, reduced spatial resolution resulted in a small but significant decrease in performance with the classification task for FFDM. For DBT ISTs, we also observed that the preferred angular range varies depending on whether the task is detection or classification.
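One way a "small but significant" change in performance between two imaging conditions could be checked is with a bootstrap interval on the AUC difference. The abstract does not say which statistical test was used, so the sketch below is an assumption, and `bootstrap_auc_diff`, `synth`, and the synthetic Gaussian scores are hypothetical stand-ins rather than the authors' method or data.

```python
import random
from typing import List, Sequence, Tuple


def auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Mann-Whitney AUC estimate; ties count as half a win."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_diff(
    pos_a: List[float], neg_a: List[float],
    pos_b: List[float], neg_b: List[float],
    n_boot: int = 200, seed: int = 0,
) -> Tuple[float, float]:
    """95% percentile interval for AUC(condition A) - AUC(condition B)."""
    rng = random.Random(seed)
    diffs: List[float] = []
    for _ in range(n_boot):
        # Resample cases with replacement, independently per class and condition.
        a = auc(rng.choices(pos_a, k=len(pos_a)), rng.choices(neg_a, k=len(neg_a)))
        b = auc(rng.choices(pos_b, k=len(pos_b)), rng.choices(neg_b, k=len(neg_b)))
        diffs.append(a - b)
    diffs.sort()
    lo = diffs[int(0.025 * n_boot)]
    hi = diffs[min(n_boot - 1, int(0.975 * n_boot))]
    return lo, hi


# Synthetic scores standing in for observer outputs at two dose levels.
gen = random.Random(1)


def synth(mean: float, n: int = 50) -> List[float]:
    return [gen.gauss(mean, 1.0) for _ in range(n)]


full_dose = (synth(1.5), synth(0.0))  # (positive-class scores, negative-class scores)
low_dose = (synth(1.0), synth(0.0))
ci_low, ci_high = bootstrap_auc_diff(*full_dose, *low_dose)
print(f"95% interval for AUC difference: [{ci_low:.3f}, {ci_high:.3f}]")
```

Under this sketch, an interval that excludes zero would be read as a significant difference at roughly the 5% level. The resampling here treats the two conditions as independent; a paired design would resample the same cases in both conditions.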
Conclusions: Integrating classification tasks into ISTs and potentially physical phantom studies can provide additional information in the evaluation of clinical breast imaging systems. This methodology can enhance the reliability of performance assessments for new breast imaging technologies. Depending on the study's objective, ISTs and physical phantom studies should aim to employ tasks that closely model actual clinical scenarios.
About the journal:
JMI covers fundamental and translational research, as well as applications, focused on medical imaging, which continue to yield physical and biomedical advancements in the early detection, diagnostics, and therapy of disease as well as in the understanding of normal. The scope of JMI includes: Imaging physics, Tomographic reconstruction algorithms (such as those in CT and MRI), Image processing and deep learning, Computer-aided diagnosis and quantitative image analysis, Visualization and modeling, Picture archiving and communications systems (PACS), Image perception and observer performance, Technology assessment, Ultrasonic imaging, Image-guided procedures, Digital pathology, Biomedical applications of biomedical imaging. JMI allows for the peer-reviewed communication and archiving of scientific developments, translational and clinical applications, reviews, and recommendations for the field.