Interactive computer-aided diagnosis on medical image using large language models
Authors: Sheng Wang, Zihao Zhao, Xi Ouyang, Tianming Liu, Qian Wang, Dinggang Shen
Journal: Communications Engineering, pages 1-9 (Journal Article)
Published: 2024-09-17
DOI: 10.1038/s44172-024-00271-8
URL: https://www.nature.com/articles/s44172-024-00271-8
PDF: https://www.nature.com/articles/s44172-024-00271-8.pdf
Computer-aided diagnosis (CAD) has advanced medical image analysis, while large language models (LLMs) have shown potential in clinical applications. However, LLMs struggle to interpret medical images, which are critical for decision-making. Here we show a strategy that integrates LLMs with CAD networks. The framework uses the LLMs' medical knowledge and reasoning to enhance the outputs of CAD networks, such as diagnosis, lesion segmentation, and report generation, by summarizing that information in natural language. The generated reports are of higher quality and can improve the performance of vision-based CAD models. On chest X-rays, the framework using ChatGPT improved diagnosis performance by 16.42 percentage points compared to state-of-the-art models, while using GPT-3 provided a 15.00 percentage point improvement in F1-score. Our strategy allows accurate report generation and creates a patient-friendly interactive system, unlike conventional CAD systems whose outputs only professionals can understand. This approach has the potential to revolutionize clinical decision-making and patient communication.

Editor's summary: Wang et al. developed a machine learning strategy that enables large language models to understand and analyse visual medical information. Their framework integrates computer-aided diagnosis networks for medical images with large language models, converting medical image inputs into a clear and concise textual summary of the patient's condition.
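The central step described above, summarizing CAD network outputs in natural language so that an LLM can reason over them, could be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `CadOutputs` container, the `describe_probability` thresholds, the prompt wording, and the example values are all hypothetical, and no actual LLM is called.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CadOutputs:
    """Hypothetical container for the outputs of vision-based CAD networks."""
    disease_probs: Dict[str, float]        # classifier: finding name -> probability
    lesion_area_fraction: Optional[float]  # segmentation: lesion area / image area
    draft_report: Optional[str]            # report generator: free-text draft


def describe_probability(p: float) -> str:
    """Map a probability to a coarse verbal label (thresholds are illustrative)."""
    if p < 0.2:
        return "unlikely"
    if p < 0.5:
        return "possible"
    if p < 0.9:
        return "likely"
    return "very likely"


def build_prompt(outputs: CadOutputs) -> str:
    """Turn numeric CAD outputs into a text prompt that an LLM could summarize."""
    lines = ["Findings from the image analysis networks:"]
    # Sort findings by name so the prompt is deterministic.
    for name, p in sorted(outputs.disease_probs.items()):
        lines.append(f"- {name}: {describe_probability(p)} (score {p:.2f})")
    if outputs.lesion_area_fraction is not None:
        lines.append(
            f"- Segmented lesion covers {outputs.lesion_area_fraction:.1%} of the image."
        )
    if outputs.draft_report:
        lines.append(f"- Draft report: {outputs.draft_report}")
    lines.append("Summarize the patient's condition in plain language.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example values, used only to show the shape of the prompt.
    example = CadOutputs(
        disease_probs={"edema": 0.62, "cardiomegaly": 0.05},
        lesion_area_fraction=0.123,
        draft_report="Mild interstitial opacities.",
    )
    print(build_prompt(example))
```

The resulting string would then be passed to whichever LLM is in use, and follow-up questions from a patient could be appended to the same conversation. How the paper actually words its prompts is not stated in this abstract.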