Approaches to Sampling for Quality Control of Artificial Intelligence in Biomedical Research
S F Chetverikov, K M Arzamasov, A E Andreichenko, V P Novik, T M Bobrovskaya, A V Vladzimirsky
Sovremennye Tehnologii v Medicine, 2023 (Epub 2023-03-29). DOI: 10.17691/stm2023.15.2.02
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10306966/pdf/
Abstract
The aim of the study is to evaluate the efficacy of sampling approaches used in periodic quality control of artificial intelligence (AI) results in biomedical practice.
Materials and methods: We analyzed sampling approaches based on point statistical estimation, statistical hypothesis testing, and ready-made statistical tables, as well as the options set out in GOST R ISO 2859-1-2007 "Statistical methods. Sampling procedures for inspection by attributes". We considered samples of different sizes for general populations of 1000 to 100,000 studies. The analysis was carried out as part of an experiment on the use of innovative computer vision technologies for the analysis of medical images and their further application in the healthcare system of Moscow (Russia).
Results: Ready-made tables are built on specific statistical input data, so they are not a universal option for biomedical research. Point statistical estimation allows a sample size to be calculated from given statistical parameters for a specified confidence interval. This approach is promising when only the type I error matters to the researcher and the type II error is not a priority. The approach based on statistical hypothesis testing accounts for both type I and type II errors given the statistical parameters. GOST R ISO 2859-1-2007 provides ready-made sample sizes for the given statistical parameters. In evaluating the studied approaches, we found that, for our purposes, the optimal number of studies for AI quality control in medical image analysis is 80. This number meets the requirements of representativeness, balances the risks to the consumer and the AI service provider, and optimizes the labor costs of the employees involved in quality control of the AI results.
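To illustrate the difference between the first two approaches, the sketch below computes a sample size by point estimation (confidence level only) and by hypothesis testing (type I and type II errors). It uses the standard normal-approximation formulas for a proportion; the function names and all parameter values are illustrative assumptions, not the settings or the exact formulas used in the study, and the sketch is not intended to reproduce the figure of 80 studies.

```python
import math
from statistics import NormalDist


def z_quantile(prob: float) -> float:
    """Standard normal quantile for the given probability."""
    return NormalDist().inv_cdf(prob)


def n_point_estimate(population: int, p: float, margin: float,
                     alpha: float = 0.05) -> int:
    """Sample size to estimate a proportion p within +/- margin at
    confidence level 1 - alpha. Only the type I error is controlled."""
    z = z_quantile(1 - alpha / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)      # finite population correction
    return math.ceil(n)


def n_hypothesis_test(p0: float, p1: float,
                      alpha: float = 0.05, beta: float = 0.20) -> int:
    """Sample size for a one-sided, one-sample test of H0: p = p0
    against H1: p = p1, controlling both the type I error (alpha)
    and the type II error (beta)."""
    z_a = z_quantile(1 - alpha)
    z_b = z_quantile(1 - beta)
    numerator = (z_a * math.sqrt(p0 * (1 - p0))
                 + z_b * math.sqrt(p1 * (1 - p1)))
    return math.ceil((numerator / abs(p1 - p0)) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: worst-case p = 0.5, 5% margin, 95% confidence.
    for size in (1_000, 10_000, 100_000):
        print(size, n_point_estimate(size, p=0.5, margin=0.05))
    # Hypothetical inputs: acceptable defect rate 10%, unacceptable 20%.
    print(n_hypothesis_test(p0=0.10, p1=0.20, alpha=0.05, beta=0.20))
```

Under these assumptions the point-estimate size barely changes as the general population grows from 1000 to 100,000 studies, while the hypothesis-testing size depends on the gap between the two rates being distinguished and on both error levels.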