An investigation of cancer cell line-based drug response prediction methods on patient data

2020 12th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2020-11-12 DOI:10.1109/KSE50997.2020.9287633

Giang T. T. Nguyen, Le Due Hoang, Q. Nguyen, T. Nguyen, Hien Dang, Duc-Hau Le



Abstract

The most significant goal of precision medicine is to identify the right treatment for individual patients based on their molecular profiles. Several large projects have provided large amounts of -omics and drug response data for human cell lines (such as GDSC and CCLE) and for patients (such as GEO). Building on these datasets, computational methods are increasingly being applied to predict untested drug responses not only in cell lines but also in patients. Such approaches build drug response prediction models on cell line data and then apply the learned models to predict drug response in patients, which also helps to address the disparity between models trained on cell lines and their clinical applications. However, the datasets are highly heterogeneous in terms of the array techniques used, the drug response measurements, and so on, which leads to inconsistent results across computational methods and datasets. In this study, we therefore assessed seven machine learning models built on cell line datasets and then applied them to patient datasets. Experimental results show that models built on pan-cancer cell lines do not work well on every cancer-specific patient dataset. In addition, larger patient datasets are suggested for measuring the prediction performance of each method correctly.
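The train-on-cell-lines, test-on-patients workflow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, ridge regression stands in for the seven models (which the abstract does not name), per-dataset z-scoring is a crude substitute for proper cross-platform harmonization, and the helper names (`fit_ridge`, `zscore`, `auc`, `simulate`) are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def zscore(X):
    """Standardize each gene within one dataset (assumed stand-in for batch correction)."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def auc(scores, labels):
    """Probability that a random responder outscores a random non-responder."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def simulate(n_cell=300, n_patient=40, n_genes=50, seed=0):
    """Synthetic data only: a shared gene signature drives both cohorts."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=n_genes)
    # Cell lines: expression plus a continuous sensitivity (lower = more sensitive).
    X_cell = rng.normal(size=(n_cell, n_genes))
    y_cell = X_cell @ true_w + rng.normal(size=n_cell)
    # Patients: same biology, but measured on a shifted and rescaled platform,
    # and only a binary responder / non-responder label is observed.
    X_raw = rng.normal(size=(n_patient, n_genes))
    X_pat = 2.0 * X_raw + 5.0
    latent = X_raw @ true_w + rng.normal(size=n_patient)
    responder = (latent < np.median(latent)).astype(int)
    return X_cell, y_cell, X_pat, responder

X_cell, y_cell, X_pat, responder = simulate()
w, b = fit_ridge(zscore(X_cell), y_cell)       # train on cell lines only
pred_sensitivity = zscore(X_pat) @ w + b       # apply unchanged to patients
# Lower predicted value means more sensitive, so negate it to score responders.
patient_auc = auc(-pred_sensitivity, responder)
print(f"patient AUC: {patient_auc:.2f}")
```

Because the synthetic patients share the cell lines' signature by construction, the transfer works well here; real cohorts need not behave this way. With only a few dozen patients the AUC estimate is also noisy, which is in line with the abstract's suggestion to evaluate on larger patient datasets.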
