Dérick G F Borges, Eluã R Coutinho, Thiago Cerqueira-Silva, Malú Grave, Adriano O Vasconcelos, Luiz Landau, Alvaro L G A Coutinho, Pablo Ivan P Ramos, Manoel Barral-Netto, Suani T R Pinho, Marcos E Barreto, Roberto F S Andrade
BMC Medical Research Methodology, volume 25, issue 1, p. 99. Published 2025-04-16. DOI: 10.1186/s12874-025-02542-0 (https://doi.org/10.1186/s12874-025-02542-0)
Combining machine learning and dynamic system techniques to early detection of respiratory outbreaks in routinely collected primary healthcare records.
Background: Methods that enable early outbreak detection represent powerful tools in epidemiological surveillance, allowing adequate planning and timely response to disease surges. Syndromic surveillance data collected from primary healthcare encounters can be used as a proxy for the incidence of confirmed cases of respiratory diseases. Deviations from historical trends in encounter numbers can provide valuable insights into emerging diseases with the potential to trigger widespread outbreaks.
Methods: Unsupervised machine learning methods and dynamical systems concepts were combined into the Mixed Model of Artificial Intelligence and Next-Generation (MMAING) ensemble, which aims to detect early signs of outbreaks based on primary healthcare encounters. We used data from 27 Brazilian health regions, which cover 41% of the country's territory, from 2017 to 2023 to identify anomalous increases in primary healthcare encounters that could be associated with an epidemic onset. Our validation approach comprised (i) a comparative analysis across Brazilian capitals; (ii) an analysis of warning signs for the COVID-19 period; and (iii) a comparison with related surveillance methods (namely EARS C1, C2, and C3) based on real and synthetic labeled data.
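For readers unfamiliar with the EARS reference methods named above, the following is a minimal sketch of the C1 and C2 rules as they are commonly described: each count is compared with the mean and standard deviation of a short moving baseline, with C2 leaving a two-point gap between the baseline and the current observation. The 7-point baseline, the threshold of 3 standard deviations, the handling of a zero standard deviation, and the function name `ears_alarms` are assumptions chosen for illustration; the abstract does not state the configuration the authors used, and C3 is omitted here.

```python
from statistics import mean, stdev


def ears_alarms(counts, baseline=7, lag=0, threshold=3.0):
    """Flag time points whose count is unusually high relative to a moving baseline.

    lag=0 gives a C1-style rule (baseline ends just before the current point);
    lag=2 gives a C2-style rule (baseline ends two points earlier).
    All parameter defaults are illustrative assumptions.
    """
    alarms = []
    for t in range(len(counts)):
        start = t - lag - baseline
        if start < 0:
            # Not enough history to form a full baseline window.
            alarms.append(False)
            continue
        window = counts[start:t - lag]
        mu = mean(window)
        sigma = stdev(window)
        if sigma == 0:
            # Assumption: with a perfectly flat baseline, flag any increase.
            alarms.append(counts[t] > mu)
        else:
            alarms.append((counts[t] - mu) / sigma > threshold)
    return alarms


# Hypothetical encounter counts with a sudden jump at the last point.
example_counts = [10, 11, 9, 10, 12, 10, 11, 10, 11, 30]
c1 = ears_alarms(example_counts, lag=0)
c2 = ears_alarms(example_counts, lag=2)
print(c1)
print(c2)
```

Both rules flag only the final point in this toy series. The lag in the C2-style rule is meant to keep the first days of a rising outbreak out of the baseline, which would otherwise inflate the baseline mean and delay the alarm.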
Results: The MMAING ensemble demonstrated its effectiveness in early outbreak detection using both actual and synthetic data, outperforming other surveillance methods. It successfully detected early warning signals in synthetic data, achieving a probability of detection of 86%, a positive predictive value of 85%, and an average reliability of 79%. When compared to EARS C1, C2, and C3, it exhibited superior performance based on receiver operating characteristic (ROC) curve results on synthetic data. When evaluated on real-world data, MMAING performed on par with EARS C2. Notably, the MMAING ensemble accurately predicted the onset of the four waves of the COVID-19 period in Brazil, further validating its effectiveness in real-world scenarios.
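The probability of detection and positive predictive value reported above can be illustrated with their standard definitions: the share of true outbreak points that raised an alarm, and the share of alarms that coincided with a true outbreak point. The sketch below scores alarms point by point against labels; this is an assumption, since the abstract does not say whether the authors scored individual time points or whole outbreak episodes, and it does not define "average reliability", which is therefore not computed here. The function name `detection_metrics` and the sample data are hypothetical.

```python
import math


def detection_metrics(alarms, truth):
    """Return (probability_of_detection, positive_predictive_value).

    alarms and truth are equal-length sequences of booleans, one per time point.
    """
    tp = sum(1 for a, y in zip(alarms, truth) if a and y)
    fp = sum(1 for a, y in zip(alarms, truth) if a and not y)
    fn = sum(1 for a, y in zip(alarms, truth) if not a and y)
    # Each ratio is undefined when its denominator is zero; report NaN then.
    pod = tp / (tp + fn) if (tp + fn) else math.nan
    ppv = tp / (tp + fp) if (tp + fp) else math.nan
    return pod, ppv


# Hypothetical labels: 2 true positives, 1 false positive, 1 false negative.
sample_alarms = [True, True, False, False, True]
sample_truth = [True, False, True, False, True]
pod, ppv = detection_metrics(sample_alarms, sample_truth)
print(f"POD = {pod:.2f}, PPV = {ppv:.2f}")
```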
Conclusion: Identifying trends in time series data related to primary healthcare encounters indicated the possibility of developing a reliable method for the early detection of outbreaks. MMAING demonstrated consistent identification capabilities across various scenarios, outperforming established reference methods.
About the journal:
BMC Medical Research Methodology is an open access journal publishing original peer-reviewed research articles in methodological approaches to healthcare research. Articles on the methodology of epidemiological research, clinical trials and meta-analysis/systematic review are particularly encouraged, as are empirical studies of the associations between choice of methodology and study outcomes. BMC Medical Research Methodology does not aim to publish articles describing scientific methods or techniques: these should be directed to the BMC journal covering the relevant biomedical subject area.