Title: Analysis of the Relationship Between NF-κB1 and Cytokine Gene Expression in Hematological Malignancy: Leveraging Explained Artificial Intelligence and Machine Learning for Small Dataset Insights.
Authors: Jae-Seung Jeong, Hyunsu Ju, Chi-Hyun Cho
Journal: International Journal of Medical Sciences, vol. 22, no. 9, pp. 2208-2226 (impact factor 3.2; JCR Q1, MEDICINE, GENERAL & INTERNAL)
Publication date: 2025-04-13
Publication type: Journal Article (eCollection)
DOI: 10.7150/ijms.109493 (https://doi.org/10.7150/ijms.109493)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12035828/pdf/
Citations: 0
Abstract
This study measures the expression of nuclear factor kappa B 1 (NF-κB1) and related cytokine genes in bone marrow mononuclear cells from patients with hematological malignancies, and analyzes the relationship between them using an integrated framework of statistical analysis, machine learning (ML), and explainable artificial intelligence (XAI). Traditional dimensionality reduction techniques (principal component analysis, linear discriminant analysis, and t-distributed stochastic neighbor embedding) showed limited differentiation in their embeddings, whereas ML classifiers (k-Nearest Neighbors, Naïve Bayes, Random Forest, and XGBoost) successfully identified critical patterns. Notably, normalized caspase-1 counts consistently emerged as the most influential feature associated with NF-κB1 activity across disease groups, as highlighted by SHapley Additive exPlanations (SHAP) analyses. Systematic evaluation of ML performance on small datasets indicated that a minimum sample size of 15-24 is necessary for reliable classification, particularly in the acute myeloid leukemia and myelodysplastic syndrome cohorts. These findings underscore the pivotal role of caspase-1 in NF-κB1 gene expression in hematological malignancies. Furthermore, the study demonstrates the feasibility of leveraging ML and XAI to derive meaningful insights from limited data, offering a robust strategy for biomarker discovery and precision medicine in rare hematological malignancies.
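The abstract's point about minimum sample size can be illustrated with a small experiment. The sketch below is not the authors' pipeline. It uses purely synthetic two-class data in which a single feature is informative, a hand-written k-nearest-neighbors classifier, and leave-one-out accuracy, all of which are assumptions made for illustration. The function names (`make_synthetic`, `knn_predict`, `loo_accuracy`, `accuracy_by_sample_size`) are hypothetical.

```python
import random
import statistics


def make_synthetic(n_per_class, n_features=4, shift=1.5, rng=None):
    """Draw a balanced two-class toy dataset; only feature 0 is informative."""
    rng = rng or random.Random(0)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += shift * label  # class 1 is shifted on feature 0 only
            X.append(row)
            y.append(label)
    return X, y


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    order = sorted(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
    )
    votes = [y_train[i] for i in order[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0


def loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    hits = 0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        hits += knn_predict(X_rest, y_rest, X[i], k) == y[i]
    return hits / len(X)


def accuracy_by_sample_size(sizes, repeats=30, k=3, seed=0):
    """Mean and spread of leave-one-out accuracy for each total sample size."""
    rng = random.Random(seed)
    summary = {}
    for n in sizes:
        scores = [
            loo_accuracy(*make_synthetic(n // 2, rng=rng), k=k)
            for _ in range(repeats)
        ]
        summary[n] = (statistics.mean(scores), statistics.stdev(scores))
    return summary


if __name__ == "__main__":
    for n, (mean, sd) in accuracy_by_sample_size([8, 16, 24, 48]).items():
        print(f"n={n:3d}  accuracy={mean:.2f} ± {sd:.2f}")
```

Repeating the draw at each sample size gives a spread as well as a mean. On real data, the point at which that spread becomes acceptably narrow is one practical way to judge whether a cohort is large enough for reliable classification.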
About the journal:
Original research papers, reviews, and short research communications in any medical-related area can be submitted to the Journal on the understanding that the work has not been published previously, in whole or in part, and is not under consideration for publication elsewhere. Manuscripts in both basic science and clinical medicine are considered. There is no restriction on the length of research papers and reviews, although authors are encouraged to be concise. Short research communications are limited to under 2500 words.