Development of machine learning models for chronic fatigue prediction in granulomatosis with polyangiitis
Alexandre Moura Dos Santos, Samuel Katsuyuki Shinjo
Advances in Rheumatology 65(1):45, published 2025-10-09. DOI: 10.1186/s42358-025-00482-3 (https://doi.org/10.1186/s42358-025-00482-3)
Abstract
Background: Chronic fatigue severely compromises the quality of life in patients with granulomatosis with polyangiitis (GPA). Traditional diagnostic methods are often time-consuming, relying on clinical expertise and detailed questionnaires. This study aimed to develop a machine learning model capable of predicting chronic fatigue in GPA patients based on clinical data, with a particular focus on improving diagnostic capacity in regions with limited access to specialists.
Methods: This cross-sectional study collected data on fatigue (measured by the Modified Fatigue Impact Scale, MFIS), functional ability (Health Assessment Questionnaire, HAQ), disease activity (Birmingham Vasculitis Activity Score, BVAS), comorbidities, medication use, physical activity (International Physical Activity Questionnaire - Short Form, IPAQ-SF), and demographic characteristics. Four machine learning algorithms (logistic regression, decision tree, random forest, and extreme gradient boosting) were assessed using a 70/30 train-test split. Model performance was evaluated using area under the curve (AUC), accuracy, F1 score, recall, and precision. Statistical comparisons were performed using Welch's t-test and the Wilcoxon-Mann-Whitney U test for continuous variables, while the chi-square test or Fisher's exact test was applied to categorical variables, with significance set at P < 0.05. All analyses were conducted using R version 4.4.1 for Windows.
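The evaluation procedure described in the Methods can be sketched in a few lines. The study itself was run in R, so the Python below is not the authors' code; it is a minimal standard-library illustration of a 70/30 split and of the reported metrics. The function names are hypothetical, and stratifying the split by fatigue status is an assumption, since the abstract does not say how the split was drawn.

```python
import random


def stratified_split(labels, test_percent=30, seed=0):
    """Return (train_idx, test_idx) with similar class proportions in both parts.

    Stratification is an assumption of this sketch, not a detail from the study.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Integer round-half-up of len(idx) * test_percent / 100.
        n_test = (len(idx) * test_percent + 50) // 100
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = fatigued)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class sizes taken from the abstract: 20 fatigued and 25 non-fatigued patients.
labels = [1] * 20 + [0] * 25
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 31 14
```

With 45 patients, a split of this kind leaves 31 for training and 14 for testing, so each test-set metric rests on roughly a dozen predictions. The pairwise form of `auc` is the Mann-Whitney U statistic divided by the number of positive-negative pairs.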
Results: Forty-five patients were assessed: 62.2% were female, with a median BMI of 27.72 kg/m² (23.2-30.1), a median age of 55.5 years, and a median disease duration of 12.0 years (6.0-17.0). Fatigue was reported by 20 patients (MFIS score ≥ 38). Seven patients (15.5%) had active disease according to the BVAS, and disease activity was similar between the fatigued and non-fatigued groups (P > 0.05). The fatigued group had higher acute-phase reactants and greater prednisone use (P < 0.05). The tree-based models achieved an AUC of approximately 0.80, outperforming the other models.
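Categorical comparisons between the fatigued and non-fatigued groups, such as medication use, rely on the chi-square or Fisher's exact test named in the Methods. The following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table, written with the Python standard library rather than the R used in the study. The function name is hypothetical and the counts in the example are invented for illustration only; the abstract does not report the underlying tables.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# HYPOTHETICAL counts, not study data: rows are fatigued / non-fatigued
# (20 and 25 patients), columns are drug use yes / no.
p_value = fisher_exact_two_sided(12, 8, 6, 19)
print(f"two-sided P = {p_value:.4f}")
```

The exact test is the usual choice over chi-square when expected cell counts are small, which is likely with group sizes of 20 and 25.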
Conclusion: Tree-based models demonstrated superior predictive performance in identifying chronic fatigue. The Random Forest model, in particular, highlighted higher disability in activities of daily living (HAQ), older age, and longer disease duration as key predictors. Although the models performed well, additional data and incorporation of clinically relevant variables may further enhance predictive accuracy. Patients with GPA who experienced fatigue showed higher glucocorticoid use and elevated acute-phase reactants, despite similar levels of disease activity, suggesting mechanisms beyond inflammation. Machine learning shows strong potential as a clinical tool for fatigue identification, especially in settings with limited access to specialist care.
Trial registration: Universal Trial Number (UTN): U1111-1271-6003; Brazilian Clinical Trials Registry (ReBEC): RBR-9n4z2hh; Plataforma Brasil: CAAE # 41762820.1.0000.0068. Registration date: January 18, 2022.
About the journal:
Formerly named Revista Brasileira de Reumatologia, the journal is celebrating its 60th year of publication.
Advances in Rheumatology is an international, open access journal publishing pre-clinical, translational and clinical studies on all aspects of paediatric and adult rheumatic diseases, including degenerative, inflammatory and autoimmune conditions. The journal is the official publication of the Brazilian Society of Rheumatology and welcomes original research (including systematic reviews and meta-analyses), literature reviews, guidelines and letters arising from published material.