Optimizing a de novo artificial intelligence-based medical device under a predetermined change control plan: Improved ability to detect or rule out pediatric autism
Dennis P. Wall, Stuart Liu-Mayo, Carmela Salomon, Jennifer Shannon, Sharief Taraman
Intelligence-based medicine, Volume 8, Article 100102 (2023). DOI: 10.1016/j.ibmed.2023.100102. https://www.sciencedirect.com/science/article/pii/S2666521223000169
Abstract
A growing number of artificial intelligence-based medical devices are receiving clearance from the Food and Drug Administration (FDA). Debate has arisen about best practices for the regulation and safe oversight of such devices, whose capabilities, if “unlocked”, include iterative learning and adaptation with exposure to new data. One regulatory mechanism proposed by the FDA is the predetermined change control plan (PCCP). This analysis provides what we believe to be the first example of how a PCCP has been leveraged in practice to improve the performance of a de novo autism diagnostic device. Following the PCCP's model update procedures included in the marketing authorization of the first generation of the device (“algorithm V1”), we conducted an algorithmic threshold optimization procedure to improve the device's ability to detect or rule out autism in children aged 18–72 months without changing the accuracy or intended use of the device. Decision threshold optimization was achieved using a repeated train/test validation procedure on a dataset of 722 children with concern for developmental delay, aged 18–72 months (28% autism, 22% neurotypical, 50% other developmental delay, mean age 3.6 years, 39% female). In each of 1000 repeats, 70% of samples were selected for threshold optimization and 30% for evaluation, ensuring that no training data appeared in the test set. Out-of-sample performance was estimated by evaluating the selected threshold pair on the test set and comparing the performance metrics of the new pair to the corresponding V1 metrics on the same test set. The device, with optimized decision thresholds, produced a determinate output for 66.5% (95% CI, 62.5–71.0) of children. Positive Predictive Value (PPV) and Negative Predictive Value (NPV) were 87.5% (95% CI, 82.5–96.7) and 95.6% (95% CI, 93.7–97.9), respectively. Threshold optimization improved the device's ability to accurately detect or rule out autism in a greater proportion of children.
Given the current waitlist crisis for autism evaluations in the United States, the potential increase in coverage offered by the optimized thresholds is promising and emphasizes the value of regulatory mechanisms that allow software as a medical device to adapt safely and appropriately given real-world data.
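The repeated train/test procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the candidate threshold grid, the baseline pair `(0.2, 0.8)`, the synthetic scores, and the selection rule (maximize the share of determinate outputs while keeping PPV and NPV at or above the baseline pair's values on the training split) are all assumptions made for illustration; the abstract does not state the actual selection criterion.

```python
import random
from itertools import combinations


def metrics(scores, labels, lo, hi):
    """Return (coverage, ppv, npv) for a lower/upper threshold pair.

    score >= hi -> positive, score <= lo -> negative, otherwise indeterminate.
    """
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        if s >= hi:
            if y:
                tp += 1
            else:
                fp += 1
        elif s <= lo:
            if y:
                fn += 1
            else:
                tn += 1
    coverage = (tp + fp + tn + fn) / len(scores)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    npv = tn / (tn + fn) if tn + fn else float("nan")
    return coverage, ppv, npv


def split_indices(n, train_frac, rng):
    """Shuffle 0..n-1 and split into disjoint train/test index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(train_frac * n)
    return idx[:n_train], idx[n_train:]


def select_pair(scores, labels, grid, baseline):
    """Assumed criterion: highest coverage with PPV and NPV >= baseline's."""
    best_cov, base_ppv, base_npv = metrics(scores, labels, *baseline)
    best = baseline
    for lo, hi in combinations(grid, 2):  # grid is sorted, so lo < hi
        cov, ppv, npv = metrics(scores, labels, lo, hi)
        if ppv >= base_ppv and npv >= base_npv and cov > best_cov:
            best, best_cov = (lo, hi), cov
    return best


def repeated_validation(scores, labels, grid, baseline,
                        n_repeats=1000, train_frac=0.7, seed=0):
    """Per repeat: choose a pair on the train split, score it on the test split."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        train, test = split_indices(len(scores), train_frac, rng)
        pair = select_pair([scores[i] for i in train],
                           [labels[i] for i in train], grid, baseline)
        test_s = [scores[i] for i in test]
        test_y = [labels[i] for i in test]
        # New pair and baseline are compared on the same held-out samples.
        results.append((metrics(test_s, test_y, *pair),
                        metrics(test_s, test_y, *baseline)))
    return results


if __name__ == "__main__":
    # Synthetic stand-in data: 722 samples, roughly 28% positive.
    rng = random.Random(1)
    labels = [rng.random() < 0.28 for _ in range(722)]
    scores = [rng.betavariate(5, 2) if y else rng.betavariate(2, 5)
              for y in labels]
    grid = [i / 10 for i in range(1, 10)]
    runs = repeated_validation(scores, labels, grid, (0.2, 0.8), n_repeats=50)
    new_cov = sorted(new[0] for new, _ in runs)
    old_cov = sorted(old[0] for _, old in runs)
    print("median coverage, optimized:", new_cov[len(new_cov) // 2])
    print("median coverage, baseline: ", old_cov[len(old_cov) // 2])
```

Because the threshold pair is chosen only on the training split and scored only on the held-out split, the per-repeat test metrics give an out-of-sample estimate. Percentile intervals over the repeats are one possible way to summarize their spread, though the abstract does not say how its confidence intervals were computed.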