Machine learning algorithm for the rapid and accurate detection of Plasmodium falciparum

IF 4.8 · Zone 2 (Medicine) · Q1 INFECTIOUS DISEASES

International Journal of Infectious Diseases Pub Date : 2025-03-01 DOI:10.1016/j.ijid.2024.107439

Mr Andrew Hill

Volume 152, Article 107439 · Journal Article · https://www.sciencedirect.com/science/article/pii/S1201971224005149

Citations: 0

Abstract

Background

Manual cell counting is a bottleneck in malaria diagnosis that automated labelling could alleviate. The high prevalence of malaria in under-developed regions calls for highly precise and computationally efficient models that deliver rapid and accurate diagnosis and could, in turn, be developed into a smartphone app.

Methods

Machine learning algorithms (MLAs) consisting of a family of tiny (3,911 to 100,000 parameters) hybrid convolutional neural network / encoder-decoder models were developed, each of which outputs both a label {Parasite, Normal} and a confidence. The models were evaluated (k-fold validation) against an established Plasmodium falciparum cell dataset from the NIH.
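As a rough illustration of the k-fold validation mentioned above, the following sketch splits cell indices into k folds and yields one train/validation pair per fold. The abstract does not state the value of k or how the folds were drawn, so the function name `k_fold_indices`, the shuffling, and the seed are assumptions made for illustration only.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_items: int, k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists, one pair per fold.

    Hypothetical helper: the paper's actual fold construction is not
    described in the abstract.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    order = list(range(n_items))
    random.Random(seed).shuffle(order)  # reproducible shuffle
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [order[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = [
            idx
            for j, fold in enumerate(folds)
            if j != held_out
            for idx in fold
        ]
        yield train, validation
```

Each cell appears in exactly one validation fold, so every cell is scored once by a model that never saw it during training.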

Results

The models achieve between 95% and 98.5% accuracy. Labelling cells with a malaria probability of 10-99% as uncertain and excluding them from the analysis resulted in >99% accuracy for the remaining cells. Accuracy measurement is limited by mislabelled cells, with as few as 120 cells in 27,000 (0.4%) confident but wrong. Consensus between 8 independent models suggests that at least 150 training cells (more than 50% of all “confident but wrong” cells) are mislabelled, and training without these cells improves model convergence and reliability.
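The triage and consensus ideas described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the 10-99% band comes from the abstract, but the function names, the treatment of the band edges as uncertain, and the unanimity rule for flagging suspected mislabels are assumptions.

```python
from typing import List, Optional, Sequence, Tuple

LOW, HIGH = 0.10, 0.99  # uncertainty band quoted in the abstract (10-99%)


def triage(p_parasite: float) -> Optional[str]:
    """Return 'Parasite' or 'Normal' when confident, None when uncertain."""
    if p_parasite > HIGH:
        return "Parasite"
    if p_parasite < LOW:
        return "Normal"
    return None  # inside the 10-99% band: leave for expert review


def confident_accuracy(
    probs: Sequence[float], labels: Sequence[str]
) -> Tuple[float, float]:
    """Accuracy on confident cells, and the fraction of cells that were confident."""
    decided = [
        (triage(p), y) for p, y in zip(probs, labels) if triage(p) is not None
    ]
    if not decided:
        return float("nan"), 0.0
    correct = sum(1 for pred, y in decided if pred == y)
    return correct / len(decided), len(decided) / len(probs)


def suspected_mislabels(
    model_probs: Sequence[Sequence[float]], labels: Sequence[str]
) -> List[int]:
    """Indices where every model is confident and all contradict the label.

    Assumption: unanimity is used as the consensus rule here; the
    abstract does not specify the rule applied across the 8 models.
    """
    suspects = []
    for i, y in enumerate(labels):
        votes = [triage(probs[i]) for probs in model_probs]
        if all(v is not None and v != y for v in votes):
            suspects.append(i)
    return suspects
```

Cells returned by `suspected_mislabels` are candidates for relabelling or exclusion from training, in the spirit of the "confident but wrong" analysis reported above.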

Discussion

MLAs that assist diagnosis can be relied upon if they output certainty, and a confident diagnosis can be formed from the certain labels alone. In many cases a low percentage of cells with uncertain labels will not change the diagnosis. Knowing that almost all cell labelling errors occur within the uncertain cells would enable a clinical workflow in which expert time is focused on marginal cells within marginal cases. Larger models are prone to overfitting, while their size limits the hardware they can run on.
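One way to make the "marginal cells within marginal cases" idea concrete is to bound the infected-cell fraction using the uncertain cells and check whether a decision threshold falls between the bounds. This is a sketch under stated assumptions: the abstract does not define a decision rule, so `SlideCounts`, `needs_expert_review`, and the `threshold` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SlideCounts:
    """Per-sample cell counts after triage (hypothetical structure)."""
    parasite: int   # confidently labelled Parasite
    normal: int     # confidently labelled Normal
    uncertain: int  # left unlabelled by the model


def parasitaemia_bounds(counts: SlideCounts) -> Tuple[float, float]:
    """Lowest and highest infected fraction consistent with the counts."""
    total = counts.parasite + counts.normal + counts.uncertain
    if total == 0:
        raise ValueError("no cells counted")
    low = counts.parasite / total                        # all uncertain cells normal
    high = (counts.parasite + counts.uncertain) / total  # all uncertain cells infected
    return low, high


def needs_expert_review(counts: SlideCounts, threshold: float) -> bool:
    """True only if resolving the uncertain cells could change the call.

    Assumption: a sample is called positive when the infected fraction
    is at least `threshold`; the threshold is illustrative.
    """
    low, high = parasitaemia_bounds(counts)
    return low < threshold <= high
```

When both bounds fall on the same side of the threshold, the uncertain cells cannot alter the outcome, so expert review can be reserved for the remaining cases.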

Conclusion

Accurate Plasmodium falciparum parasite identification is possible with 12,000-parameter models. Automation of bulk labelling work would allow expert time to be focused on cases where uncertainty would affect diagnosis. A path to reliable, rapid and mobile malaria diagnosis has been identified, based on tiny models suitable for mobile phone deployment in poor, malaria-affected countries. Further work to enable rapid response to malaria is required.


Source journal

International Journal of Infectious Diseases (Medicine: Infectious Diseases)

CiteScore

18.90

Self-citation rate

2.40%

Articles published

1020

Review time

30 days

Journal description: International Journal of Infectious Diseases (IJID). Publisher: International Society for Infectious Diseases. Publication frequency: Monthly. Type: Peer-reviewed, Open Access. Scope: Publishes original clinical and laboratory-based research; reports clinical trials, reviews, and some case reports; focuses on epidemiology, clinical diagnosis, treatment, and control of infectious diseases; emphasizes diseases common in under-resourced countries.