Use of artificial intelligence in the analysis of digital videos of invasive surgical procedures: scoping review

Anni King, George E Fowler, Rhiannon C Macefield, Hamish Walker, Charlie Thomas, Sheraz Markar, Ethan Higgins, Jane M Blazeby, Natalie S Blencowe

BJS Open, volume 9, issue 4, July 2025. DOI: 10.1093/bjsopen/zraf073. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12268333/pdf/
Abstract
Introduction: Surgical videos are a valuable data source, offering detailed insights into surgical practice. However, video analysis requires specialist clinical knowledge and takes considerable time. Artificial intelligence (AI) has the potential to improve and streamline the interpretation of intraoperative video data. This systematic scoping review aimed to summarize the use of AI in the analysis of videos of surgical procedures and identify evidence gaps.
Methods: Systematic searches of Ovid MEDLINE and Embase were performed using search terms 'artificial intelligence', 'video', and 'surgery'. Data extraction included reporting of general study characteristics; the overall objective of AI; descriptions of data sets, AI models, and training; methods of data annotation; and measures of accuracy. Data were summarized descriptively.
Results: In all, 122 studies were included. More than half focused on gastrointestinal procedures (75 studies, 61.5%), predominantly cholecystectomy (47, 38.5%). The most common objectives were surgical phase recognition (40 studies, 32.8%), surgical instrument recognition (28, 23.0%), and enhanced intraoperative visualization (23, 18.9%). Of the studies, 79.5% (97) used a single data set and most (92, 75.4%) used supervised machine learning techniques. There was considerable variation across the studies in terms of the number of videos, centres, and contributing surgeons. Forty-seven studies (38.5%) did not report the number of annotators, and details about their experience were frequently omitted (102, 83.6%). Most studies used multiple outcome measures (67, 54.9%), most commonly overall or best accuracy of the AI model (67, 54.9%).
Conclusion: This review found that many studies omitted essential methodological details of AI training, testing, data annotation, and validation processes, making these studies difficult to interpret and replicate. Another key finding was the lack of large data sets drawn from multiple centres and surgeons. Future research should focus on curating large, varied, open-access data sets from multiple centres, patients, and surgeons to facilitate accurate evaluation using real-world data.