R. Bajaj, Dr. Shandilya, Shivangi Gagneja, Khushi Gupta, Deepak Rawat
A Risk Predictive Model for Primary Tumor using Machine Learning with Initial Missing Values
2022 IEEE International Conference on Data Science and Information System (ICDSIS), published 2022-07-29
DOI: 10.1109/ICDSIS55133.2022.9915957 (https://doi.org/10.1109/ICDSIS55133.2022.9915957)
Citations: 1
Abstract
A primary tumor is a tumor growing at the anatomical site where tumor growth began and progressed to produce a malignant mass. At a later stage, a primary tumor can lead to cancer. Machine learning assists researchers in identifying and classifying tumors based on growth features, size, speed of spread, and other factors, as well as in grouping them by comparable predicted outcomes. However, missing values in medical data can lead to biased study conclusions and make it difficult to predict and analyze data with high performance. Therefore, K-nearest-neighbors (KNN) imputation was implemented in Python. The method sorts the complete samples in the primary tumor dataset by their Euclidean distance to a sample with missing values and uses the nearest ones to fill the gaps, and the optimal value of K is determined. After imputing the inconsistent data and performing several simulations, overall performance increased. Hence, this approach may be used to diagnose diseases using more intricate clinical data.
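The imputation step described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function name `knn_impute`, the toy data, the use of `None` to mark missing values, and the choice to fill a gap with the mean of the neighbors' values are all assumptions made for illustration. It follows the description only in that complete samples are sorted by Euclidean distance and the K nearest are used.

```python
import math

def knn_impute(rows, k=3):
    """Fill None entries using the k nearest complete rows (Euclidean distance).

    Hypothetical helper for illustration; missing values are marked with None.
    """
    # Only rows without any missing value act as donors.
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row")

    imputed = []
    for row in rows:
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            imputed.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        # Distance uses only the features this row actually has.
        def dist(other, row=row, observed=observed):
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

        # Sort the complete samples by distance and keep the k nearest.
        neighbours = sorted(complete, key=dist)[:k]

        filled = list(row)
        for j in missing:
            # Assumption: a missing value is replaced by the neighbors' mean.
            filled[j] = sum(n[j] for n in neighbours) / len(neighbours)
        imputed.append(filled)
    return imputed

# Toy data (made up): the last row is missing its second feature.
data = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 5.0],
    [9.0, 9.0, 9.0],
    [1.0, None, 4.0],
]
result = knn_impute(data, k=2)
print(result[3])
```

Here the two nearest complete rows to the incomplete one are the first two, so the gap is filled with the mean of their second feature. One plausible way to choose K, again an assumption rather than the paper's stated procedure, is to repeat the imputation for several values of K and keep the one that gives the best downstream classification performance. Libraries such as scikit-learn also provide a `KNNImputer` class, although its distance handling differs from this sketch.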