Milos Savic, V. Kurbalija, Mihailo Ilic, M. Ivanović, D. Jakovetić, A. Valachis, Serge Autexier, Johannes Rust, T. Kosmidis
Journal: Computer Science and Information Systems, pp. 381-404
Publication date: 2023-01-01
DOI: 10.2298/csis220227061s (https://doi.org/10.2298/csis220227061s)
The application of machine learning techniques in prediction of quality of life features for cancer patients
Quality of life (QoL) is one of the major issues for cancer patients. With the advent of medical databases containing large amounts of relevant QoL information, it has become possible to train predictive QoL models using machine learning (ML) techniques. However, training predictive QoL models poses several challenges, mostly due to data privacy concerns and missing values in patient data. In this paper, we analyze several classification and regression ML models that predict QoL indicators for breast and prostate cancer patients. Three different approaches are employed for imputing missing values, and several privacy-preservation settings are tested. The examined ML models are trained on datasets formed from two databases containing a large number of anonymized medical records of cancer patients from Sweden. Two learning scenarios are considered: centralized and federated learning. In the centralized learning scenario, all patient data coming from different data sources is collected at a central location prior to model training. Federated learning, by contrast, enables collective training of machine learning models without data sharing. The results of our experimental evaluation show that the predictive power of federated models is comparable to that of centrally trained models for short-term QoL predictions, whereas for long-term periods centralized models provide more accurate QoL predictions. Furthermore, we provide insights into the quality of data preprocessing tasks (missing value imputation and differential privacy).
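The difference between the two learning scenarios can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain least-squares regression model, synthetic data standing in for two data sources, and a simple federated-averaging loop in which each site sends only model weights to the coordinator. All function names (local_update, federated_average, centralized_fit, make_sites) are hypothetical.

```python
import numpy as np


def local_update(w, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps of least-squares regression on one site's data."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def federated_average(sites, dim, rounds=50):
    """Federated scenario: only model weights leave a site, never the records."""
    w = np.zeros(dim)
    total = sum(len(y) for _, y in sites)
    for _ in range(rounds):
        # Each site refines the current global model on its own data.
        local_ws = [local_update(w, X, y) for X, y in sites]
        # The coordinator averages the weights, weighted by site size.
        w = sum(len(y) / total * lw for lw, (_, y) in zip(local_ws, sites))
    return w


def centralized_fit(sites):
    """Centralized scenario: pool all records in one place, then fit once."""
    X = np.vstack([X for X, _ in sites])
    y = np.concatenate([y for _, y in sites])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def make_sites(seed=0):
    """Synthetic stand-in for two data sources (assumption, not the paper's data)."""
    rng = np.random.default_rng(seed)
    true_w = np.array([1.5, -2.0, 0.5])
    sites = []
    for n in (200, 120):
        X = rng.normal(size=(n, 3))
        y = X @ true_w + 0.01 * rng.normal(size=n)
        sites.append((X, y))
    return sites, true_w


if __name__ == "__main__":
    sites, true_w = make_sites()
    print("federated:  ", federated_average(sites, dim=3))
    print("centralized:", centralized_fit(sites))
```

On this toy problem both scenarios should recover nearly the same weights, because the two synthetic sites draw from the same distribution. The paper's finding that centralized models are more accurate for long-term predictions concerns real patient data and is not reproduced by this sketch.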
Journal description:
Aims and scope
Computer Science and Information Systems (ComSIS) is an international refereed journal, published in Serbia. The objective of ComSIS is to communicate important research and development results in the areas of computer science, software engineering, and information systems.