Scientific hypothesis generation by large language models: laboratory validation in breast cancer treatment.

IF 3.5 · Zone 2 (multidisciplinary journals) · JCR Q1 (MULTIDISCIPLINARY SCIENCES)

Journal of The Royal Society Interface Pub Date : 2025-06-01 Epub Date: 2025-06-04 DOI:10.1098/rsif.2024.0674

Abbi Abdel-Rehim, Hector Zenil, Oghenejokpeme Orhobor, Marie Fisher, Ross J Collins, Elizabeth Bourne, Gareth W Fearnley, Emma Tate, Holly X Smith, Larisa N Soldatova, Ross King

Volume 22, issue 227, article 20240674. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12134935/pdf/


Abstract

Large language models (LLMs) have transformed artificial intelligence (AI) and achieved breakthrough performance on a wide range of tasks. In science, the most interesting application of LLMs is for hypothesis formation. A feature of LLMs, which results from their probabilistic structure, is that the output text is not necessarily a valid inference from the training text. Such outputs are termed 'hallucinations', and are harmful in many applications. In science, some hallucinations may be useful: novel hypotheses whose validity may be tested by laboratory experiments. Here, we experimentally test the application of LLMs as a source of scientific hypotheses using the domain of breast cancer treatment. We applied the LLM GPT4 to hypothesize novel synergistic pairs of US Food and Drug Administration (FDA)-approved non-cancer drugs that target the MCF7 breast cancer cell line relative to the non-tumorigenic breast cell line MCF10A. In the first round of laboratory experiments, GPT4 succeeded in discovering three drug combinations (out of 12 tested) with synergy scores above the positive controls. GPT4 then generated new combinations based on its initial results; this yielded three more combinations with positive synergy scores (out of four tested). We conclude that LLMs are a valuable source of scientific hypotheses.


Source journal

Journal of The Royal Society Interface (category: multidisciplinary journals)

CiteScore: 7.10

Self-citation rate: 2.60%

Articles published: 234

Review time: 2.5 months

Journal description: J. R. Soc. Interface welcomes articles of high-quality research at the interface of the physical and life sciences. It provides a high-quality forum to publish rapidly and interact across this boundary in two main ways: J. R. Soc. Interface publishes research applying chemistry, engineering, materials science, mathematics and physics to the biological and medical sciences; it also highlights discoveries in the life sciences of relevance to the physical sciences. Both sides of the interface are considered equally, and it is one of the few journals to cover this exciting new territory. J. R. Soc. Interface welcomes contributions on a diverse range of topics, including but not limited to: biocomplexity, bioengineering, bioinformatics, biomaterials, biomechanics, bionanoscience, biophysics, chemical biology, computer science (as applied to the life sciences), medical physics, synthetic biology, systems biology, theoretical biology and tissue engineering.