Machine learning algorithm for the rapid and accurate detection of Plasmodium falciparum

IF 4.8 · CAS Zone 2 (Medicine) · JCR Q1 (Infectious Diseases)
Mr Andrew Hill
Journal: International Journal of Infectious Diseases, volume 152, Article 107439
Publication date: 2025-03-01
Publication type: Journal Article
DOI: 10.1016/j.ijid.2024.107439
URL: https://www.sciencedirect.com/science/article/pii/S1201971224005149
Citations: 0

Abstract

Background

Manual cell counting is a malaria diagnostic bottleneck which could be alleviated by assistance from automated labelling. The high prevalence of malaria in under-developed regions requires highly precise and computationally efficient models to achieve rapid and accurate diagnosis, which in turn has the potential to be developed into a smartphone app.

Methods

Machine learning algorithms (MLAs) consisting of a family of tiny (3,911 to 100,000 parameters) hybrid convolutional neural network / encoder-decoder models were developed, which output both a label {Parasite, Normal} and a confidence. The models were evaluated (k-fold validation) against an established Plasmodium falciparum cell dataset from the NIH.
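As an illustration of the k-fold validation mentioned above, the following is a minimal sketch of how cell indices could be split into folds. The abstract does not state the value of k, the shuffling scheme, or any tooling, so the function name `k_fold_indices`, the seed, and `k=5` are assumptions made for the example only.

```python
import random


def k_fold_indices(n_items, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold validation.

    Hypothetical helper: the abstract does not say how folds were built.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    # Shuffle once with a fixed seed so the folds are reproducible.
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        val_idx = folds[held_out]
        train_idx = [
            i for j, fold in enumerate(folds) if j != held_out for i in fold
        ]
        yield train_idx, val_idx


if __name__ == "__main__":
    # k=5 is an assumed value, used only for demonstration.
    for fold_no, (train_idx, val_idx) in enumerate(k_fold_indices(10, k=5)):
        print(fold_no, len(train_idx), len(val_idx))
```

Each cell appears in exactly one validation fold, so every cell is scored by a model that did not see it during training.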

Results

The models achieve between 95% and 98.5% accuracy. Labelling cells with a probability of malaria of 10-99% as uncertain and excluding them from the analysis resulted in >99% accuracy for the remaining cells. Accuracy measurement is limited by mislabelled cells, with as few as 120 cells in 27,000 (0.4%) confident but wrong. Consensus between 8 independent models suggests that at least 150 training cells (more than 50% of all “confident but wrong” cells) are mislabelled, and training without these cells improves model convergence and reliability.
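The uncertainty band and the model-consensus check described above can be sketched roughly as follows. Only the 10-99% band comes from the abstract. The function names, the inclusive band edges, and the rule that a cell is flagged only when every model is confident and all of them contradict the dataset label are assumptions, since the abstract does not define how consensus was computed.

```python
def triage(p_malaria, low=0.10, high=0.99):
    """Map a predicted malaria probability to a three-way label.

    The 10-99% band is from the abstract; inclusive edges are an assumption.
    """
    if p_malaria < low:
        return "Normal"
    if p_malaria > high:
        return "Parasite"
    return "Uncertain"


def confident_accuracy(probs, truths, low=0.10, high=0.99):
    """Return (accuracy on confident cells, share of uncertain cells)."""
    labels = [triage(p, low, high) for p in probs]
    kept = [(lab, t) for lab, t in zip(labels, truths) if lab != "Uncertain"]
    uncertain_share = 1 - len(kept) / len(labels)
    if not kept:
        return None, uncertain_share
    correct = sum(lab == t for lab, t in kept)
    return correct / len(kept), uncertain_share


def suspected_mislabels(per_model_probs, truths, low=0.10, high=0.99):
    """Indices where all models are confident, agree, and contradict the label.

    Unanimity is an assumed consensus rule, not the authors' stated method.
    """
    flagged = []
    for i, truth in enumerate(truths):
        votes = {triage(probs[i], low, high) for probs in per_model_probs}
        if len(votes) == 1 and "Uncertain" not in votes and truth not in votes:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Made-up probabilities and labels, purely for demonstration.
    probs = [0.02, 0.995, 0.50, 0.999, 0.01]
    truths = ["Normal", "Parasite", "Parasite", "Normal", "Normal"]
    print(confident_accuracy(probs, truths))
    print(suspected_mislabels([probs, probs], truths))
```

In this toy input the fourth cell is confidently called Parasite by every model while the dataset says Normal, which is the "confident but wrong" pattern that either reflects a model error or a mislabelled training cell.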

Discussion

MLAs that assist diagnosis can be relied upon if they output their certainty, and a confident diagnosis can be formed using only the labels the model is certain of. In many cases a low percentage of cells with uncertain labels will not change the diagnosis. Knowing that almost all cell labelling errors occur within the uncertain cells would enable a clinical workflow where expert time is focused on marginal cells within marginal cases. Larger models are prone to overfitting, while their size limits the hardware they can be run on.
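One way to make concrete the claim that a small share of uncertain cells often leaves the diagnosis unchanged is to bound the infected-cell fraction from below and above. This is a sketch under stated assumptions: the abstract gives no decision rule, so the `decision_threshold` on the infected fraction and both function names are hypothetical, and the cell counts are invented for the example.

```python
def infected_fraction_bounds(n_parasite, n_normal, n_uncertain):
    """Bounds on the infected fraction if uncertain cells could go either way."""
    total = n_parasite + n_normal + n_uncertain
    if total == 0:
        raise ValueError("no cells counted")
    lower = n_parasite / total                  # all uncertain cells are normal
    upper = (n_parasite + n_uncertain) / total  # all uncertain cells are infected
    return lower, upper


def needs_expert_review(n_parasite, n_normal, n_uncertain, decision_threshold):
    """True only when the uncertain cells could flip the outcome.

    decision_threshold is a hypothetical cut-off on the infected fraction;
    a sample counts as positive when the fraction is at or above it.
    """
    lower, upper = infected_fraction_bounds(n_parasite, n_normal, n_uncertain)
    return lower < decision_threshold <= upper


if __name__ == "__main__":
    # Invented counts with an assumed 2% threshold.
    print(needs_expert_review(0, 990, 10, 0.02))    # bounds stay below the cut-off
    print(needs_expert_review(15, 975, 10, 0.02))   # bounds straddle the cut-off
    print(needs_expert_review(50, 940, 10, 0.02))   # bounds stay above the cut-off
```

Only the middle case would be routed to an expert, which matches the idea of focusing expert time on marginal cells within marginal cases.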

Conclusion

Accurate Plasmodium falciparum parasite identification is possible with 12,000-parameter models. Automation of bulk labelling work would allow expert time to be focused on cases where uncertainty would affect diagnosis. A path to reliable, rapid and mobile malaria diagnosis has been identified, based on tiny models suitable for mobile phone deployment in poor, malaria-affected countries. Further work to enable rapid response to malaria is required.
Source journal
CiteScore: 18.90
Self-citation rate: 2.40%
Articles published: 1020
Review time: 30 days
Journal description: International Journal of Infectious Diseases (IJID)
Publisher: International Society for Infectious Diseases
Publication frequency: Monthly
Type: Peer-reviewed, Open Access
Scope: Publishes original clinical and laboratory-based research. Reports clinical trials, reviews, and some case reports. Focuses on epidemiology, clinical diagnosis, treatment, and control of infectious diseases. Emphasizes diseases common in under-resourced countries.