Qiang Li;Dongchen Li;Weizhi Nie;He Jiao;Zhenhua Wu;Anan Liu
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 8, pp. 4860-4872. Publication date: 2025-03-30. DOI: 10.1109/TKDE.2025.3569584. URL: https://ieeexplore.ieee.org/document/11018491/. Impact factor: 10.4; JCR: Q1 (Computer Science, Artificial Intelligence).
Temporal and Spatial Analysis in Early Sepsis Prediction via Causal Disentanglements
Sepsis is one of the leading causes of death among ICU patients, and accurate, stable early prediction is essential for clinical intervention. Existing methods mostly rely on traditional time-series models (e.g., LSTM, Transformer) or clinical scoring criteria (e.g., SOFA, qSOFA), but they face two major challenges: 1) spurious correlations in the data undermine model robustness; 2) the underlying causal relationships in the data space are not modeled. We propose a Serialized Causal Disentanglement Model (SCDM) that decouples latent variables into sepsis-related factors ($u$), other disease-related factors ($v$), and irrelevant confounders ($s$). On the MIMIC-IV v2.2 dataset (3,511 positive samples and 17,538 negative samples), SCDM takes patient clinical indicators, personal information, and clinical notes as input and achieves an AUC of 0.765-0.928 on the prediction task from 48 to 0 hours before sepsis onset. This is significantly better than the baseline models (e.g., 0.662-0.910 for Transformer and 0.692-0.913 for MGP-AttTCN). Experiments show that optimizing the time window (5 hours of continuous observation) and variable selection (45 key indicators) improves model performance. The effectiveness of causal disentanglement is verified through Grad-CAM and t-SNE visualizations, and key clinical indicators such as platelet count, lactic acid, and respiratory rate are identified to provide interpretable decision support for doctors. Our study provides a high-precision, interpretable causal disentanglement framework for early sepsis prediction, which is expected to promote the development of intelligent diagnosis and treatment in the ICU.
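The windowing setup mentioned in the abstract (5 hours of continuous observation over 45 indicators, evaluated from 48 to 0 hours before onset) can be illustrated with a minimal sketch. This is not the authors' preprocessing code: the function name `observation_window`, the hourly `(hours, features)` array layout, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def observation_window(series, onset_hour, horizon, window=5):
    """Return the `window` hourly rows that end `horizon` hours before onset.

    series     : array of shape (hours, features), one row per hour
                 (an assumed layout, not taken from the paper).
    onset_hour : row index at which sepsis onset is recorded.
    horizon    : how many hours before onset the prediction is made (0..48).
    Returns None when the stay is too short to fill the window.
    """
    end = onset_hour - horizon   # exclusive end index of the window
    start = end - window         # inclusive start index of the window
    if start < 0 or end > series.shape[0]:
        return None
    return series[start:end]


# Synthetic stay: 72 hourly rows, 45 indicators (placeholder values only).
rng = np.random.default_rng(0)
stay = rng.normal(size=(72, 45))

# Window used for a prediction made 48 hours before onset at hour 60.
win = observation_window(stay, onset_hour=60, horizon=48)
```

Each returned window would then be the input to whatever sequence model is being evaluated at that horizon; stays that cannot supply a full window are skipped here rather than padded, which is one possible design choice among several.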
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.