scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction
Qing Wang, Yining Pan, Minghao Zhou, Zijia Tang, Yanfei Wang, Guangyu Wang, Qianqian Song
ArXiv, published 2025-05-08
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12083700/pdf/
Citations: 0
Abstract
Drug resistance remains a significant barrier to improving the effectiveness of cancer therapies. To better understand the biological mechanisms driving resistance, single-cell profiling has emerged as a powerful tool for characterizing cellular heterogeneity. Recent advances in large-scale foundation models have shown potential for enhancing single-cell analysis, yet their performance in drug response prediction remains underexplored. In this study, we developed scDrugMap, an integrated framework for drug response prediction that features both a Python command-line tool and an interactive web server. scDrugMap supports the evaluation of a wide range of foundation models, including eight single-cell foundation models and two large language models (LLMs), using large-scale single-cell datasets across diverse tissue types, cancer types, and treatment regimens. The framework incorporates a curated data resource consisting of a primary collection of 326,751 cells from 36 datasets across 23 studies, and a validation collection of 18,856 cells from 17 datasets across 6 studies. Using scDrugMap, we conducted comprehensive benchmarking under two evaluation scenarios: pooled-data evaluation and cross-data evaluation. In both settings, we implemented two model training strategies: layer freezing and fine-tuning of the foundation models with Low-Rank Adaptation (LoRA). In the pooled-data evaluation, scFoundation outperformed all other models, although most models achieved competitive performance. Specifically, scFoundation achieved the highest mean F1 scores of 0.971 and 0.947 using layer freezing and fine-tuning, outperforming the lowest-performing model by 54% and 57%, respectively. In the cross-data evaluation, UCE achieved the highest performance (mean F1 score: 0.774) after fine-tuning on tumor tissue, while scGPT demonstrated superior performance (mean F1 score: 0.858) in a zero-shot learning setting. Together, this study presents the first comprehensive benchmarking of large-scale foundation models for drug response prediction in single-cell data and introduces a user-friendly, flexible platform to support drug discovery and translational research.
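The results above are reported as mean F1 scores. As a reading aid, here is a minimal sketch of how a per-dataset binary F1 score and its mean across datasets could be computed. The abstract does not state the averaging convention or the label encoding used by scDrugMap, so the function names, the 1 = resistant / 0 = sensitive encoding, and the toy labels below are all assumptions made for illustration; they are not taken from the scDrugMap code or data.

```python
from statistics import mean


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for one dataset; returns 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def mean_f1(per_dataset):
    """Unweighted average of F1 over datasets.

    per_dataset maps a dataset name to a (y_true, y_pred) pair.
    """
    return mean(f1_score(t, p) for t, p in per_dataset.values())


# Toy labels (assumed encoding: 1 = resistant, 0 = sensitive).
# Invented for illustration only; not data from the paper.
toy = {
    "dataset_a": ([1, 1, 0, 0], [1, 0, 0, 0]),
    "dataset_b": ([1, 0, 1, 0], [1, 0, 1, 1]),
}
scores = {name: f1_score(t, p) for name, (t, p) in toy.items()}
overall = mean_f1(toy)
```

Averaging per-dataset scores without weighting gives small and large datasets equal influence; weighting by cell count would be an equally plausible convention, and the abstract does not say which one the authors used.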