Yang Yan, Zhong Chen, Xinglei Shen, Ronald C Chen, Hao Gao
BioData Mining 18(1):53. Published 2025-08-12. DOI: https://doi.org/10.1186/s13040-025-00464-7. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12341308/pdf/. Impact factor: 6.1; JCR: Q1 (Mathematical & Computational Biology).
Short- and long-term weekly patient-reported outcomes prediction undergoing radiotherapy: single-patient time series model vs. transformer-based multi-patient time series model.
Background: Patient-reported outcomes (PROs) are direct reports from patients on health status, symptoms, quality of life, or treatment satisfaction, offering critical insights into subjective experiences that clinical metrics may overlook. Accurately predicting personalized short- and long-term weekly PROs during radiotherapy is essential for monitoring health status, optimizing treatment efficacy, and enabling timely interventions to manage side effects.
Methods: Using a well-documented prostate cancer PRO dataset (17 patients after pre-processing), this study evaluates two single-patient time series models, vector autoregression (VAR) and VAR with incremental ground truth PRO data (VAR-Inc), and one transformer-based multi-patient model, the Temporal Fusion Transformer (TFT), for short- and long-term weekly PRO prediction. VAR-Inc integrates follow-up PRO data to refine predictions, while TFT leverages multi-patient heterogeneous information to capture complex temporal patterns.
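The difference between a fixed-origin forecast and an incrementally updated one can be illustrated with a small sketch. This is not the authors' implementation: it swaps the multivariate VAR for a univariate AR(1) model so that it runs on the standard library alone, and it assumes that "incremental" means appending each newly observed weekly score and refitting before the next prediction. The function names and the example scores are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + a * y[t-1]; returns (c, a)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        # A flat history carries no slope information; predict the mean.
        return mean_y, 0.0
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = cov_xy / var_x
    return mean_y - a * mean_x, a


def forecast_fixed(history, horizon):
    """Fit once, then feed each prediction back in as the next input."""
    c, a = fit_ar1(history)
    preds, last = [], history[-1]
    for _ in range(horizon):
        last = c + a * last
        preds.append(last)
    return preds


def forecast_incremental(history, future_truth):
    """One-step-ahead forecasts with a refit after every observed week.

    Assumption: each week's true score becomes available before the
    following week is predicted.
    """
    known = list(history)
    preds = []
    for observed in future_truth:
        c, a = fit_ar1(known)
        preds.append(c + a * known[-1])
        known.append(observed)  # incorporate the follow-up report
    return preds


if __name__ == "__main__":
    # Hypothetical weekly symptom scores for one patient.
    baseline_weeks = [1.0, 2.0, 3.0, 4.0]
    later_weeks = [4.0, 4.0]
    print(forecast_fixed(baseline_weeks, len(later_weeks)))
    print(forecast_incremental(baseline_weeks, later_weeks))
```

In this toy setting the fixed-origin forecast keeps extrapolating the early trend, while the incremental variant corrects itself once a follow-up score arrives. It also makes clear why such a scheme depends on follow-up reports actually being available.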
Results: Experiments on prostate cancer patients show that (1) VAR-Inc outperformed VAR (lower MAE/RMSE), highlighting the importance of incremental PRO updates. (2) TFT outperformed both VAR models in long-term prediction, with statistical significance, by utilizing multi-patient data. (3) TFT effectively captured weekly PRO trends and variations, aligning closely with ground truth. (4) Unlike single-patient models, TFT built robust predictive frameworks by integrating cross-patient similarities and complementary PRO information across patients. VAR-Inc's performance deteriorated when follow-up PROs were missing, whereas TFT remained stable. On average, TFT achieved the lowest MAE (0.7715), compared with 1.1329 for VAR and 0.8089 for VAR-Inc, and the lowest RMSE (0.9586), compared with 1.4817 for VAR and 1.0693 for VAR-Inc.
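For readers unfamiliar with the two error metrics, the following minimal sketch shows how MAE and RMSE are conventionally computed, and how the averages reported above translate into relative error reductions. The helper names and the `REPORTED` dictionary are illustrative; only the six numbers inside it come from the abstract, and the abstract does not say how the averages were aggregated across patients or horizons.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def relative_reduction(baseline, candidate):
    """Fraction by which `candidate` lowers the error of `baseline`."""
    return (baseline - candidate) / baseline


# Average errors as reported in the abstract.
REPORTED = {
    "VAR": {"MAE": 1.1329, "RMSE": 1.4817},
    "VAR-Inc": {"MAE": 0.8089, "RMSE": 1.0693},
    "TFT": {"MAE": 0.7715, "RMSE": 0.9586},
}

if __name__ == "__main__":
    for model in ("VAR", "VAR-Inc"):
        for metric in ("MAE", "RMSE"):
            gain = relative_reduction(
                REPORTED[model][metric], REPORTED["TFT"][metric]
            )
            print(f"TFT vs {model}, {metric}: {gain:.1%} lower")
```

Because RMSE squares each error before averaging, it penalizes occasional large misses more heavily than MAE does, which is why the two metrics are usually reported together.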
Conclusion: TFT emerges as a reliable approach for PRO prediction, excelling in long-term accuracy, trend capture, and resilience to data gaps by leveraging multi-patient information. Its ability to synthesize heterogeneous PRO data offers advantages over single-patient models, supporting personalized treatment adaptation and informed clinical decision-making. This underscores the potential of transformer-based models in enhancing PRO-driven radiotherapy management.
About the journal:
BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data.
Topical areas include, but are not limited to:
-Development, evaluation, and application of novel data mining and machine learning algorithms.
-Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
-Open-source software for the application of data mining and machine learning algorithms.
-Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies.
-Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.