Using Machine Learning Models to Diagnose Chronic Rhinosinusitis: Analysis of Pre-Treatment Patient-Generated Health Data to Predict Cardinal Symptoms and Sinonasal Inflammation.
Arun M Raghavan, Mohamed A Aboueisha, Ion Prohnitchi, David J Cvancara, Ian M Humphreys, Aria Jafari, Waleed M Abuzeid
American Journal of Rhinology & Allergy. Published 2025-03-11. DOI: 10.1177/19458924251322081 (https://doi.org/10.1177/19458924251322081)
Citations: 0
Abstract
Background: The diagnosis of chronic rhinosinusitis (CRS) relies on patient-reported symptoms together with objective findings on nasal endoscopy and/or computed tomography (CT). Many patients evaluated by an otolaryngologist or rhinologist lack objective findings confirming CRS and do not have the disease.
Objective: We hypothesized that a machine learning model (MLM) could predict probable CRS from patient-reported data acquired before rhinologist-directed treatment. We applied a machine learning approach to patient-generated health data to predict (1) the primary endpoint of sinonasal inflammation on CT, evidenced by a Lund-Mackay score (LMS) ≥ 5, and (2) the secondary endpoint of LMS ≥ 5 together with ≥ 2 cardinal symptoms of CRS.
Methods: A total of 543 patients were evaluated at a tertiary care rhinology clinic and subsequently underwent CT imaging with LMS. Patient-reported outcome measures and additional patient data were collected via an electronic platform before in-person evaluation. Three MLMs (a random forest classifier, a deep neural network, and an extreme gradient boosting (XGBoost) algorithm) were trained on predictors drawn from patient-generated health data and tested on a naïve test set (90:10 training:test split). Cross-validation was performed, and model performance was compared between algorithms and with linear regression techniques.
Results: A total of 57 predictors were extracted from the patient-generated health data. The best model (XGBoost) achieved an area under the curve (AUC) of 71.3% (accuracy 74.5%, sensitivity 38.9%, specificity 91.9%) in predicting the primary endpoint, and an AUC of 79.8% (accuracy 85.5%, sensitivity 36.4%, specificity 97.7%) in predicting the secondary endpoint. This exceeded the performance of a linear regression model.
Conclusion: An MLM using patient-generated health data accurately predicted patients with probable CRS (≥ 2 cardinal symptoms and LMS ≥ 5). With further validation in a larger cohort, such a tool could potentially be used by otolaryngologists to inform the clinical utility of diagnostic imaging and to screen patients before subspecialty rhinology referral.
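The reported accuracy, sensitivity, and specificity all derive from a single confusion matrix on the test set, and the short Python sketch below shows that arithmetic. The abstract does not report patient counts for the test set: the counts used here are hypothetical, chosen as one reconstruction that reproduces the published percentages under the assumption that a 90:10 split of 543 patients leaves about 55 patients for testing. The `ConfusionMatrix` class and `as_percent` helper are illustrative names, not part of the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary classification counts on a test set."""
    tp: int  # endpoint present, predicted present
    fn: int  # endpoint present, predicted absent
    fp: int  # endpoint absent, predicted present
    tn: int  # endpoint absent, predicted absent

    def total(self) -> int:
        return self.tp + self.fn + self.fp + self.tn

    def sensitivity(self) -> float:
        # Fraction of patients with the endpoint that the model flags.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of patients without the endpoint that the model clears.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Fraction of all test patients classified correctly.
        return (self.tp + self.tn) / self.total()


def as_percent(value: float) -> float:
    """Express a proportion as a percentage rounded to one decimal."""
    return round(100 * value, 1)


# ASSUMPTION: hypothetical counts, not reported in the abstract.
# They are consistent with the published percentages for a 55-patient test set.
primary = ConfusionMatrix(tp=7, fn=11, fp=3, tn=34)    # LMS >= 5
secondary = ConfusionMatrix(tp=4, fn=7, fp=1, tn=43)   # LMS >= 5 and >= 2 cardinal symptoms

for name, cm in (("primary", primary), ("secondary", secondary)):
    print(
        f"{name}: n={cm.total()}, "
        f"accuracy={as_percent(cm.accuracy())}%, "
        f"sensitivity={as_percent(cm.sensitivity())}%, "
        f"specificity={as_percent(cm.specificity())}%"
    )
```

Under these assumed counts, high specificity paired with low sensitivity means the model rarely flags a patient without the endpoint but misses most patients who have it. The AUC cannot be recovered from a single confusion matrix, because it summarizes performance across all decision thresholds rather than at one.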
About the journal:
The American Journal of Rhinology & Allergy is a peer-reviewed, scientific publication committed to expanding knowledge and publishing the best clinical and basic research within the fields of Rhinology & Allergy. Its focus is to publish information which contributes to improved quality of care for patients with nasal and sinus disorders. Its primary readership consists of otolaryngologists, allergists, and plastic surgeons. Published material includes peer-reviewed original research, clinical trials, and review articles.