SensitiveCancerGPT: Leveraging Generative Large Language Model on Structured Omics Data to Optimize Drug Sensitivity Prediction.

Shaika Chowdhury, Sivaraman Rajaganapathy, Lichao Sun, Liewei Wang, Ping Yang, James R Cerhan, Nansu Zong
{"title":"SensitiveCancerGPT: Leveraging Generative Large Language Model on Structured Omics Data to Optimize Drug Sensitivity Prediction.","authors":"Shaika Chowdhury, Sivaraman Rajaganapathy, Lichao Sun, Liewei Wang, Ping Yang, James R Cerhan, Nansu Zong","doi":"10.1101/2025.02.27.640661","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>The fast accumulation of vast pharmacogenomics data of cancer cell lines provide unprecedented opportunities for drug sensitivity prediction (DSP), a crucial prerequisite for the advancement of precision oncology. Recently, Generative Large Language Models (LLM) have demonstrated performance and generalization prowess across diverse tasks in the field of natural language processing (NLP). However, the structured format of the pharmacogenomics data poses challenge for the utility of LLM in DSP. Therefore, the objective of this study is multi-fold: to adapt prompt engineering for structured pharmacogenomics data toward optimizing LLM's DSP performance, to evaluate LLM's generalization in real-world DSP scenarios, and to compare LLM's DSP performance against that of state-of-the-science baselines.</p><p><strong>Methods: </strong>We systematically investigated the capability of the Generative Pre-trained Transformer (GPT) as a DSP model on four publicly available benchmark pharmacogenomics datasets, which are stratified by five cancer tissue types of cell lines and encompass both oncology and non-oncology drugs. Essentially, the predictive landscape of GPT is assessed for effectiveness on the DSP task via four learning paradigms: zero-shot learning, few-shot learning, fine-tuning and clustering pretrained embeddings. To facilitate GPT in seamlessly processing the structured pharmacogenomics data, domain-specific novel prompt engineering is employed by implementing three prompt templates (i.e., Instruction, Instruction-Prefix, Cloze) and integrating pharmacogenomics-related features into the prompt. We validated GPT's performance in diverse real-world DSP scenarios: cross-tissue generalization, blind tests, and analyses of drug-pathway associations and top sensitive/resistant cell lines. Furthermore, we conducted a comparative evaluation of GPT against multiple Transformer-based pretrained models and existing DSP baselines.</p><p><strong>Results: </strong>Extensive experiments on the pharmacogenomics datasets across the five tissue cohorts demonstrate that fine-tuning GPT yields the best DSP performance (28% F1 increase, p-value= 0.0003) followed by clustering pretrained GPT embeddings (26% F1 increase, p-value= 0.0005), outperforming GPT in-context learning (i.e., few-shot). However, GPT in the zero-shot setting had a big F1 gap, resulting in the worst performance. Within the scope of prompt engineering, performance enhancement was achieved by directly instructing GPT about the DSP task and resorting to a concise context format (i.e., instruction-prefix), leading to F1 performance gain of 22% (p-value=0.02); while incorporation of drug-cell line prompt context derived from genomics and/or molecular features further boosted F1 score by 2%. Compared to state-of-the-science DSP baselines, GPT significantly asserted superior mean F1 performance (16% gain, p-value<0.05) on the GDSC dataset. In the cross-tissue analysis, GPT showcased comparable generalizability to the within-tissue performances on the GDSC and PRISM datasets, while statistically significant F1 performance improvements on the CCLE (8%, p-value=0.001) and DrugComb (19%, p-value=0.009) datasets. Evaluation on the challenging blind tests suggests GPT's competitiveness on the CCLE and DrugComb datasets compared to random splitting. Furthermore, analyses of the drug-pathway associations and log probabilities provided valuable insights that align with previous DSP findings.</p><p><strong>Conclusion: </strong>The diverse experiment setups and in-depth analysis underscore the importance of generative LLM, such as GPT, as a viable in silico approach to guide precision oncology.</p><p><strong>Availability: </strong>https://github.com/bioIKEA/SensitiveCancerGPT.</p>","PeriodicalId":519960,"journal":{"name":"bioRxiv : the preprint server for biology","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11888479/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"bioRxiv : the preprint server for biology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1101/2025.02.27.640661","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Objective: The fast accumulation of vast pharmacogenomics data of cancer cell lines provide unprecedented opportunities for drug sensitivity prediction (DSP), a crucial prerequisite for the advancement of precision oncology. Recently, Generative Large Language Models (LLM) have demonstrated performance and generalization prowess across diverse tasks in the field of natural language processing (NLP). However, the structured format of the pharmacogenomics data poses challenge for the utility of LLM in DSP. Therefore, the objective of this study is multi-fold: to adapt prompt engineering for structured pharmacogenomics data toward optimizing LLM's DSP performance, to evaluate LLM's generalization in real-world DSP scenarios, and to compare LLM's DSP performance against that of state-of-the-science baselines.

Methods: We systematically investigated the capability of the Generative Pre-trained Transformer (GPT) as a DSP model on four publicly available benchmark pharmacogenomics datasets, which are stratified by five cancer tissue types of cell lines and encompass both oncology and non-oncology drugs. Essentially, the predictive landscape of GPT is assessed for effectiveness on the DSP task via four learning paradigms: zero-shot learning, few-shot learning, fine-tuning and clustering pretrained embeddings. To facilitate GPT in seamlessly processing the structured pharmacogenomics data, domain-specific novel prompt engineering is employed by implementing three prompt templates (i.e., Instruction, Instruction-Prefix, Cloze) and integrating pharmacogenomics-related features into the prompt. We validated GPT's performance in diverse real-world DSP scenarios: cross-tissue generalization, blind tests, and analyses of drug-pathway associations and top sensitive/resistant cell lines. Furthermore, we conducted a comparative evaluation of GPT against multiple Transformer-based pretrained models and existing DSP baselines.

Results: Extensive experiments on the pharmacogenomics datasets across the five tissue cohorts demonstrate that fine-tuning GPT yields the best DSP performance (28% F1 increase, p-value= 0.0003) followed by clustering pretrained GPT embeddings (26% F1 increase, p-value= 0.0005), outperforming GPT in-context learning (i.e., few-shot). However, GPT in the zero-shot setting had a big F1 gap, resulting in the worst performance. Within the scope of prompt engineering, performance enhancement was achieved by directly instructing GPT about the DSP task and resorting to a concise context format (i.e., instruction-prefix), leading to F1 performance gain of 22% (p-value=0.02); while incorporation of drug-cell line prompt context derived from genomics and/or molecular features further boosted F1 score by 2%. Compared to state-of-the-science DSP baselines, GPT significantly asserted superior mean F1 performance (16% gain, p-value<0.05) on the GDSC dataset. In the cross-tissue analysis, GPT showcased comparable generalizability to the within-tissue performances on the GDSC and PRISM datasets, while statistically significant F1 performance improvements on the CCLE (8%, p-value=0.001) and DrugComb (19%, p-value=0.009) datasets. Evaluation on the challenging blind tests suggests GPT's competitiveness on the CCLE and DrugComb datasets compared to random splitting. Furthermore, analyses of the drug-pathway associations and log probabilities provided valuable insights that align with previous DSP findings.

Conclusion: The diverse experiment setups and in-depth analysis underscore the importance of generative LLM, such as GPT, as a viable in silico approach to guide precision oncology.

Availability: https://github.com/bioIKEA/SensitiveCancerGPT.

求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信