Automated cooling tower detection through deep learning for Legionnaires’ disease outbreak investigations: a model development and validation study
Karen K Wong MD, Thaddeus Segura MIDS, Gunnar Mein MIDS, Jia Lu PhD, Elizabeth J Hannapel MPH, Jasen M Kunz MPH, Troy Ritter PhD, Jessica C Smith MPH, Alberto Todeschini PhD, Fred Nugen PhD, Chris Edens PhD
Lancet Digital Health, volume 6, issue 7, pages e500-e506. Published 2024-06-19. DOI: 10.1016/S2589-7500(24)00094-3
Abstract
Background
Cooling towers containing Legionella spp are a high-risk source of Legionnaires’ disease outbreaks. Manually locating cooling towers from aerial imagery during outbreak investigations requires expertise, is labour intensive, and can be prone to errors. We aimed to train a deep learning computer vision model to automatically detect cooling towers that are aerially visible.
Methods
Between Jan 1 and 31, 2021, we extracted satellite view images of Philadelphia (PA, USA) and New York state (NY, USA) from Google Maps and annotated cooling towers to create training datasets. We augmented training data with synthetic data and model-assisted labelling of additional cities. Using 2051 images containing 7292 cooling towers, we trained a two-stage model using YOLOv5, a model that detects objects in images, and EfficientNet-b5, a model that classifies images. We assessed the primary outcomes of sensitivity and positive predictive value (PPV) of the model against manual labelling on test datasets of 548 images, including from two cities not seen in training (Boston [MA, USA] and Athens [GA, USA]). We compared the search speed of the model with that of manual searching by four epidemiologists.
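The two-stage design and the two primary outcomes described above can be illustrated with a minimal Python sketch. The abstract does not say how the two stages are combined, so this assumes one common arrangement: a detector proposes candidate boxes and a classifier then accepts or rejects each candidate. The names `Box`, `two_stage_detect`, `det_threshold`, and `cls_threshold`, and the threshold values, are hypothetical; the detector and classifier are passed in as plain callables rather than real YOLOv5 or EfficientNet-b5 models.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Box:
    """A candidate detection: pixel corners plus detector confidence (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def two_stage_detect(
    image: Any,
    detector: Callable[[Any], List[Box]],
    classifier: Callable[[Any, Box], float],
    det_threshold: float = 0.25,  # assumed value, not from the study
    cls_threshold: float = 0.5,   # assumed value, not from the study
) -> List[Box]:
    """Stage 1 proposes boxes; stage 2 keeps only boxes the classifier accepts."""
    candidates = [b for b in detector(image) if b.score >= det_threshold]
    return [b for b in candidates if classifier(image, b) >= cls_threshold]


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of manually labelled towers that the model found: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def ppv(true_pos: int, false_pos: int) -> float:
    """Share of model detections that are real towers: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)
```

With illustrative counts that are not taken from the study, 90 true positives, 10 missed towers, and 30 false detections give a sensitivity of 0.90 and a PPV of 0.75. A second-stage classifier of this kind trades a little sensitivity for a higher PPV, because it can only remove candidates, never add them.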
Findings
The model identified visible cooling towers with 95·1% sensitivity (95% CI 94·0–96·1) and a PPV of 90·1% (95% CI 90·0–90·2) in New York City and Philadelphia. In Boston, sensitivity was 91·6% (89·2–93·7) and PPV was 80·8% (80·5–81·2). In Athens, sensitivity was 86·9% (75·8–94·2) and PPV was 85·5% (84·2–86·7). For an area of New York City encompassing 45 blocks (0·26 square miles), the model searched more than 600 times faster (7·6 s; 351 potential cooling towers identified) than did human investigators (mean 83·75 min [SD 29·5]; mean 310·8 cooling towers [42·2]).
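The "more than 600 times faster" figure can be checked directly from the reported times. The short Python sketch below uses only the numbers given above; the variable names are illustrative.

```python
# Reported search times for the 45-block area of New York City.
human_mean_minutes = 83.75   # mean across human investigators
model_seconds = 7.6          # model search time

# Convert the human time to seconds, then take the ratio.
human_mean_seconds = human_mean_minutes * 60   # 5025 seconds
speedup = human_mean_seconds / model_seconds

# Ratio of candidate towers flagged by the model to the human mean count.
model_towers = 351
human_mean_towers = 310.8
tower_ratio = model_towers / human_mean_towers

print(f"speedup: about {speedup:.0f}x")           # about 661x
print(f"tower count ratio: {tower_ratio:.2f}")    # 1.13
```

The ratio of about 661 is consistent with the stated "more than 600 times faster". The model also flagged roughly 13% more candidates than the human mean, which fits a PPV below 100%: some of the 351 are expected to be false detections.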
Interpretation
The model could be used to accelerate investigation and source control during outbreaks of Legionnaires’ disease through the identification of cooling towers from aerial imagery, potentially preventing additional disease spread. The model has already been used by public health teams for outbreak investigations and to initialise cooling tower registries, which are considered best practice for preventing and responding to outbreaks of Legionnaires’ disease.
About the journal:
The Lancet Digital Health publishes important, innovative, and practice-changing research on any topic connected with digital technology in clinical medicine, public health, and global health.
The journal’s open access content crosses subject boundaries, building bridges between health professionals and researchers. By bringing together the most important advances in this multidisciplinary field, The Lancet Digital Health is the most prominent publishing venue in digital health.
We publish a range of content types including Articles, Review, Comment, and Correspondence, helping to promote digital technologies in health practice worldwide.