Kim A Lindblade, Corine Ngufor, William Yavo, Sunday Atobatele, Arthur Mpimbaza, Nelson Ssewante, Ese Akpiroroh, Abibatou Konaté-Toure, Idelphonse Ahogni, Augustin Kpemasse, Antoine Mea Tanoh, Godwin Ntadom, Jimmy Opigo, Stephanie Zobrist, Kevin Griffith, Michael Humes
Malaria Journal 24(1):302. Published 2025-09-30. DOI: https://doi.org/10.1186/s12936-025-05522-3
Evaluating the performance of an artificial intelligence-based electronic reader for malaria rapid diagnostic tests across Benin, Côte d'Ivoire, Nigeria and Uganda.
Background: The introduction of malaria rapid diagnostic tests (RDTs) has expanded the parasitological confirmation of malaria at all levels of health systems in sub-Saharan Africa, improving case management and surveillance. However, concerns persist regarding healthcare worker adherence to RDT outcomes and the accuracy of RDT results recorded in health facility registers. Electronic RDT readers have been proposed to improve the consistency of interpretation and reporting. The HealthPulse smartphone application (Audere, Seattle, WA, USA), an RDT reader using an artificial intelligence (AI) computer vision algorithm, was assessed against a trained human panel interpreting RDT results from photographs to determine the application's performance characteristics.
Methods: In 2023, the Malaria Rapid Diagnostic Test Capture and Reporting Assessment (MaCRA) was implemented in health facilities in Benin, Côte d'Ivoire, Nigeria, and Uganda. Study staff photographed malaria RDTs using the HealthPulse application after healthcare workers performed and interpreted the tests. A trained panel of external reviewers interpreted the RDT images and served as the reference standard. RDTs in the images were classified according to the manufacturer's instructions as positive, negative or invalid (i.e., no visible control line) or labelled as uninterpretable (i.e., visibility was impeded). The performance of the HealthPulse AI algorithm was evaluated using percent accuracy, recall (i.e., sensitivity and specificity), precision (i.e., positive and negative predictive values), and F1 scores (harmonic mean of recall and precision) weighted by the number of each outcome. Logistic regression was applied to assess factors influencing recall across countries, RDT products, presence of faint lines, and anomalies.
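The metrics named in the Methods can be illustrated with a minimal sketch. It computes accuracy, per-class precision and recall, and an F1 score weighted by the number of reference images in each class. The function names and the ten-image toy data are hypothetical and only for illustration; they are not the study's data or the authors' analysis code.

```python
from collections import Counter

# The four outcome classes described in the Methods.
LABELS = ["positive", "negative", "invalid", "uninterpretable"]


def accuracy(reference, predicted):
    """Share of images where the reader agrees with the reference panel."""
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def per_class_metrics(reference, predicted, labels=LABELS):
    """Return {label: (precision, recall, f1, support)} for paired label lists."""
    pairs = Counter(zip(reference, predicted))
    metrics = {}
    for label in labels:
        true_pos = pairs[(label, label)]
        # support: images the panel assigned to this class
        support = sum(n for (ref, _), n in pairs.items() if ref == label)
        # predicted_n: images the reader assigned to this class
        predicted_n = sum(n for (_, pred), n in pairs.items() if pred == label)
        precision = true_pos / predicted_n if predicted_n else 0.0
        recall = true_pos / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1, support)
    return metrics


def weighted_f1(metrics):
    """Average the per-class F1 scores, weighting each by its support."""
    total = sum(m[3] for m in metrics.values())
    return sum(m[2] * m[3] for m in metrics.values()) / total


# Hypothetical toy data: panel (reference) labels vs. reader (predicted) labels.
reference = ["positive"] * 4 + ["negative"] * 4 + ["invalid"] * 2
predicted = (["positive"] * 3 + ["negative"]
             + ["negative"] * 3 + ["invalid"]
             + ["invalid"] * 2)

toy_metrics = per_class_metrics(reference, predicted)
for label, (prec, rec, f1, n) in toy_metrics.items():
    print(f"{label:16s} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f} n={n}")
print(f"accuracy={accuracy(reference, predicted):.3f}")
print(f"weighted F1={weighted_f1(toy_metrics):.3f}")
```

Because the weighting follows class size, a rare class such as invalid contributes little to the overall F1 even when its precision is low. This is one reason a high overall score can coexist with weak performance on a rare class.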
Results: Of the 110,843 RDT images collected, 106,877 (96.4%) were included in the analysis. The AI algorithm demonstrated high accuracy (96.8%; 95% confidence interval (CI) 96.7%, 96.9%) compared with the panel interpretation and an overall F1 score of 96.6. Recall and precision were > 97% for positive and negative outcomes but much lower for invalid (recall: 84.8%; precision: 42.8%) and uninterpretable (recall: 0.8%; precision: 2.3%) classifications. AI performance varied by country, RDT product, the presence of faint lines and the quality of the image. When test lines were faint, the AI algorithm was significantly less likely to recall both positive results (adjusted odds ratio (aOR) 0.02; 95% CI 0.02, 0.02) and negative results (aOR 0.10; 95% CI 0.07, 0.16).
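As a rough plausibility check on the reported interval, the sketch below applies a normal-approximation (Wald) confidence interval to the rounded accuracy of 96.8% and the 106,877 analysed images. The abstract does not state which interval method the authors used, so the Wald formula is an assumption here, and `wald_ci` is a hypothetical helper name.

```python
import math


def wald_ci(proportion, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion observed on n items."""
    se = math.sqrt(proportion * (1 - proportion) / n)
    return proportion - z * se, proportion + z * se


# Inputs are taken from the abstract; the accuracy is the rounded published value.
low, high = wald_ci(0.968, 106_877)
print(f"95% CI: {low:.1%}, {high:.1%}")
```

With a sample this large the standard error is about 0.05 percentage points, which is why the interval is only about 0.2 percentage points wide.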
Conclusions: The HealthPulse AI algorithm demonstrated strong agreement with a trained panel in interpreting malaria RDT images across diverse settings. However, the reduced performance for invalid outcomes and varying performance by country, RDT product and faint lines highlight the need for further research and refinement. The HealthPulse application shows potential as a supportive tool in research, training, surveillance, and quality assurance.
About the journal:
Malaria Journal is aimed at the scientific community interested in malaria in its broadest sense. It is the only journal that publishes exclusively articles on malaria and, as such, it aims to bring together knowledge from the different specialities involved in this very broad discipline, from the bench to the bedside and to the field.