Xianfei Ye, Xinfeng Zhao, Yinyu Lou, Hanqi Pan, Yunying Chen
Journal: Clinical Chemistry and Laboratory Medicine (impact factor 3.8; JCR Q1, Medical Laboratory Technology)
Publication date: 2025-05-28
DOI: 10.1515/cclm-2025-0302 (https://doi.org/10.1515/cclm-2025-0302)
Citations: 0
Abstract
Machine learning algorithms with body fluid parameters: an interpretable framework for malignant cell screening in cerebrospinal fluid.
Objectives: This study aimed to develop and validate a machine learning (ML) model utilizing cerebrospinal fluid (CSF) body fluid parameters from hematology analyzers to screen for malignant cells.
Methods: We analyzed 643 consecutive CSF samples from patients with central nervous system symptoms, with 191 samples classified as positive for malignant cells based on cytological examination, for model derivation. Body fluid parameters were measured using the body fluid mode of a hematology analyzer. Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied to identify predictive biomarkers, followed by performance evaluations of six ML algorithms. Model interpretability was assessed using SHapley Additive exPlanations (SHAP). The selected model was also externally validated with an additional 136 CSF samples.
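The LASSO selection step mentioned in the Methods can be illustrated with a minimal sketch. It assumes the squared-error form of LASSO solved by coordinate descent; the abstract does not state the exact formulation, penalty strength, or software used, so the function names, the `alpha` value, and the tiny synthetic data set below are all hypothetical and are not taken from the study.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return exactly 0.0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100, tol=1e-10):
    """Minimise (1/2n)*||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent.

    Assumes the columns of X are centred and non-constant and y is centred.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(coef[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            new_value = soft_threshold(rho / n, alpha) / (norm / n)
            max_change = max(max_change, abs(new_value - coef[j]))
            coef[j] = new_value
        if max_change < tol:
            break
    return coef


# Synthetic toy data (not study data): three centred, mutually orthogonal features.
# The response depends strongly on feature 0, weakly on feature 1, not on feature 2.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.1, 1.9, -1.9, -2.1]

coef = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, c in enumerate(coef) if c != 0.0]
print(coef, selected)
```

With the penalty chosen here, the weak and the irrelevant features are shrunk to exactly zero and only the strong feature survives, which is the property that makes LASSO usable as a feature selector before fitting the candidate classifiers.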
Results: The median leukocyte (WBC) and total nucleated cell (TNC) counts in the cytology-positive samples were significantly lower than those in the cytology-negative samples (5.4 vs. 31.8 and 7.4 vs. 32.6, respectively, p<0.001). The support vector machine (SVM) model achieved the highest area under the curve (AUC) of 0.899 (SD: 0.035) and the highest sensitivity of 0.827 (SD: 0.059) in internal validation. SHAP analysis identified the percentage of high fluorescence cells and monocytes as the two most significant predictors, both positively correlated with malignant cell outcomes. External validation demonstrated a comparable AUC and sensitivity, confirming the model's generalizability.
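For readers unfamiliar with the two metrics reported above, the following sketch shows how AUC and sensitivity can be computed from classifier scores. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not results from the study, and the function names are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity(labels, scores, threshold):
    """Fraction of true positives among all positive samples at a score threshold."""
    true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    false_neg = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    return true_pos / (true_pos + false_neg)


# Illustrative values only: 1 = cytology-positive, 0 = cytology-negative.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.7]

auc = roc_auc(labels, scores)
sens = sensitivity(labels, scores, threshold=0.5)
print(auc, sens)  # 0.9375 0.75
```

AUC is independent of any threshold, whereas sensitivity depends on the chosen cut-off; a screening model would typically be tuned toward high sensitivity so that few malignant samples are missed.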
Conclusions: We developed an ML model that predicts cytological outcomes in CSF using routinely available body fluid parameters. The model demonstrated consistent performance during external validation.
About the journal:
Clinical Chemistry and Laboratory Medicine (CCLM) publishes articles on novel teaching and training methods applicable to laboratory medicine. CCLM welcomes contributions on the progress in fundamental and applied research and cutting-edge clinical laboratory medicine. It is one of the leading journals in the field, with an impact factor over 3. CCLM is issued monthly, and it is published in print and electronically.
CCLM is the official journal of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) and regularly publishes EFLM recommendations and news. CCLM is also the official journal of the National Societies of Austria (ÖGLMKC), Belgium (RBSLM), Germany (DGKL), Hungary (MLDT), Ireland (ACBI), Italy (SIBioC), Portugal (SPML), and Slovenia (SZKK), and it is affiliated to AACB (Australia) and SFBC (France).
Topics:
- clinical biochemistry
- clinical genomics and molecular biology
- clinical haematology and coagulation
- clinical immunology and autoimmunity
- clinical microbiology
- drug monitoring and analysis
- evaluation of diagnostic biomarkers
- disease-oriented topics (cardiovascular disease, cancer diagnostics, diabetes)
- new reagents, instrumentation and technologies
- new methodologies
- reference materials and methods
- reference values and decision limits
- quality and safety in laboratory medicine
- translational laboratory medicine
- clinical metrology