Machine Learning for Prediction of Non-Small Cell Lung Cancer Based on Inflammatory and Nutritional Indicators in Adults: A Cross-Sectional Study
Authors: Qiaoli Wang, Tao Liang, Yuexi Li, Xiaoqin Liu
Journal: Cancer Management and Research
Publication date: 2024-05-30
DOI: 10.2147/cmar.s454638 (https://doi.org/10.2147/cmar.s454638)
Keywords: machine learning, non-small cell lung cancer, inflammatory indicators, nutritional indicators, ratio, diagnosis
Abstract
Purpose: This study aimed to evaluate the potential value of blood inflammatory markers in the diagnosis of non-small cell lung cancer (NSCLC) and to propose a machine-learning-based method for predicting NSCLC in asymptomatic adults.
Patients and Methods: This cross-sectional study used the medical records of 139 patients with NSCLC and physical examination data, collected from May 2022 to May 2023, from 198 healthy controls. The NSCLC cohort comprised 128 cases of adenocarcinoma, 3 cases of squamous cell carcinoma, and 8 cases of other NSCLC subtypes. The associations between NSCLC and inflammatory and nutritional markers, such as monocytes, neutrophils, LMR, NLR, PLR, and PHR, were examined. Features were selected using a Python feature selection library and analyzed with five machine learning algorithms. The models' ability to predict NSCLC diagnosis was assessed by precision, accuracy, recall, F1 score, and area under the curve (AUC).
Results: The 14 most important features were PDW, age, TP, RBC, HGB, LYM, LYM%, RDW, PLR, LMR, PHR, MONO, MONO%, and gender. Among the five machine learning algorithms, naive Bayes (NB) showed the best overall performance in predicting adult NSCLC, with an accuracy of 0.87, a macro-average F1 score of 0.85, a weighted-average F1 score of 0.87, and an AUC of 0.84.
Conclusion: In the feature ranking, platelet distribution width (PDW) was the most important feature, and the NB algorithm performed best in predicting NSCLC diagnosis in adults.
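The abstract reports accuracy, macro-average F1, weighted-average F1, and AUC without defining them. The following is a minimal sketch of how these metrics are commonly computed for a two-class problem. It is not the authors' code. The labels, scores, the 0.5 decision threshold, and the function names (`class_f1`, `summarize`, `rank_auc`) are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def class_f1(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def summarize(y_true, y_pred):
    """Accuracy plus macro- and weighted-average F1 over all classes."""
    labels = sorted(set(y_true))
    support = Counter(y_true)  # number of true samples per class
    n = len(y_true)
    f1 = {lab: class_f1(y_true, y_pred, lab) for lab in labels}
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        # Macro average: every class counts equally.
        "macro_f1": sum(f1.values()) / len(labels),
        # Weighted average: each class counts in proportion to its support.
        "weighted_f1": sum(f1[lab] * support[lab] / n for lab in labels),
    }


def rank_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = NSCLC, 0 = control); NOT taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.35, 0.45, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

result = summarize(y_true, y_pred)
auc = rank_auc(y_true, scores)
print(result, auc)
```

In this toy example the smaller class has the lower F1, so the macro average falls below the weighted average. That is one possible reading of the gap between the reported macro (0.85) and weighted (0.87) F1 scores, given the unequal group sizes (139 cases vs 198 controls), though the abstract does not report per-class figures.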
About the journal:
Cancer Management and Research is an international, peer-reviewed, open access journal focusing on cancer research and the optimal use of preventative and integrated treatment interventions to achieve improved outcomes, enhanced survival, and quality of life for cancer patients. Specific topics covered in the journal include:
◦ Epidemiology, detection and screening
◦ Cellular research and biomarkers
◦ Identification of biotargets and agents with novel mechanisms of action
◦ Optimal clinical use of existing anticancer agents, including combination therapies
◦ Radiation and surgery
◦ Palliative care
◦ Patient adherence, quality of life, satisfaction
The journal welcomes submitted papers covering original research, basic science, clinical & epidemiological studies, reviews & evaluations, guidelines, expert opinion and commentary, and case series that shed novel insights on a disease or disease subtype.