Prediction of Cervical Cancer Patients' Survival Period with Machine Learning Techniques
Authors: Intorn Chanudom, Ekkasit Tharavichitkul, Wimalin Laosiritaworn
Journal: Healthcare Informatics Research (impact factor 2.3; JCR Q3, Medical Informatics)
Published: 2024-01-01 (Epub 2024-01-31)
DOI: https://doi.org/10.4258/hir.2024.30.1.60
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10879821/pdf/
Citations: 0
Abstract
Prediction of Cervical Cancer Patients' Survival Period with Machine Learning Techniques.
Objectives: This study applied machine learning (ML) algorithms to predict the survival of cervical cancer patients. The aim was to address the limitations of traditional statistical methods, which often fail to provide accurate answers due to the complexity of the problem.
Methods: We used visualization techniques for initial data understanding, then used ML algorithms to develop both classification and regression models for survival prediction. In the classification models, we trained the algorithms to predict the time interval between the initial diagnosis and the patient's death, categorized as "<6 months," "6 months to 3 years," "3 years to 5 years," or ">5 years." The regression model aimed to predict survival time in months. We used attribute weights to interpret the models, highlighting the features with a significant impact on predictions and clarifying the models' behavior and decision-making process.
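The classification target described in the Methods can be illustrated with a minimal sketch that maps a survival time in months to one of the four intervals. The function name `survival_class` and the handling of values that fall exactly on a boundary (6, 36, or 60 months go into the later interval) are assumptions made for illustration; the abstract does not say how the authors treated boundary cases.

```python
from bisect import bisect_right

# Interval edges in months (6 months, 3 years, 5 years).
# Assumption: a value exactly on an edge falls into the later interval.
EDGES = [6, 36, 60]
LABELS = [
    "<6 months",
    "6 months to 3 years",
    "3 years to 5 years",
    ">5 years",
]


def survival_class(months: float) -> str:
    """Return the survival-interval label for a survival time in months."""
    if months < 0:
        raise ValueError("survival time must be non-negative")
    # bisect_right counts how many edges are <= months, which is the label index.
    return LABELS[bisect_right(EDGES, months)]


if __name__ == "__main__":
    for m in (3, 6, 48, 72):
        print(m, "->", survival_class(m))
```

A regression model would instead use the raw number of months as its target, so no binning step is needed there.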
Results: The gradient boosting trees algorithm achieved an accuracy of 81.55% in the classification model, while the random forest algorithm performed best in the regression model, with a root mean square error of 22.432. Notably, radiation doses around the affected areas significantly influenced survival duration.
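For readers unfamiliar with the two metrics reported in the Results, the following sketch shows their standard definitions: accuracy is the fraction of correctly predicted classes, and root mean square error (RMSE) is the square root of the mean squared difference between predicted and actual values. The helper names and the toy inputs are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true class labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error, expressed in the same unit as the target."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))


if __name__ == "__main__":
    # Toy values for illustration only.
    print(accuracy(["<6 months", ">5 years"], ["<6 months", "3 years to 5 years"]))
    print(rmse([12.0, 24.0], [15.0, 20.0]))
```

Because RMSE carries the unit of the predicted quantity, an RMSE computed on survival times measured in months is itself expressed in months.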
Conclusions: Machine learning demonstrated the ability to provide high-accuracy predictions of survival periods in both classification and regression problems. This suggests its potential use as a decision-support tool in the process of treatment planning and resource allocation for each patient.