Pathology report generation from whole slide images with knowledge retrieval and multi-level regional feature selection

IF 4.9 · Zone 2 (Medicine) · JCR Q1 · Computer Science, Interdisciplinary Applications

Computer Methods and Programs in Biomedicine · Pub Date: 2025-02-27 · DOI: 10.1016/j.cmpb.2025.108677

Dingyi Hu, Zhiguo Jiang, Jun Shi, Fengying Xie, Kun Wu, Kunming Tang, Ming Cao, Jianguo Huai, Yushan Zheng

Volume 263, Article 108677 · Full text: https://www.sciencedirect.com/science/article/pii/S016926072500094X

Citations: 0

Abstract



Pathology report generation from whole slide images with knowledge retrieval and multi-level regional feature selection

Background and objectives:

With the development of deep learning techniques, computer-assisted pathology diagnosis plays a crucial role in clinical diagnosis. An important task within this field is report generation, which provides doctors with text descriptions of whole slide images (WSIs). Report generation from WSIs presents significant challenges due to the structural complexity and pathological diversity of tissues, as well as the large size and high information density of WSIs. The objective of this study is to design a histopathology report generation method that can efficiently generate reports from WSIs and is suitable for clinical practice.

Methods:

In this paper, we propose a novel approach for generating pathology reports from WSIs, leveraging knowledge retrieval and multi-level regional feature selection. To deal with the uneven distribution of pathological information in WSIs, we introduce a multi-level regional feature encoding network and a feature selection module that extract multi-level region representations and filter out region features irrelevant to the diagnosis, enabling more efficient report generation. Moreover, we design a knowledge retrieval module that leverages the diagnostic information from historical cases to improve report generation performance. Additionally, we propose an out-of-domain application mode based on a large language model (LLM). The use of an LLM enhances the scalability of the generation model and improves its adaptability to data from different sources.
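The abstract does not specify how region features are scored or how historical cases are retrieved, so the following is only a minimal sketch of the two general ideas (keeping the most relevant region features, then looking up similar past cases). It assumes top-k selection by a precomputed relevance score, mean pooling, and cosine-similarity retrieval; the function names, the toy vectors, and the case bank are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def select_regions(features: List[List[float]], scores: List[float], k: int) -> List[List[float]]:
    """Keep the k region features with the highest relevance scores (assumed given),
    preserving their original order in the slide."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [features[i] for i in sorted(top)]


def mean_pool(features: List[List[float]]) -> List[float]:
    """Average the kept region features into one slide-level vector."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


def retrieve_reports(query: Sequence[float], bank: List[Tuple[List[float], str]], top_n: int) -> List[str]:
    """Return the report texts of the top_n historical cases most similar to the query."""
    ranked = sorted(bank, key=lambda item: cosine(query, item[0]), reverse=True)
    return [text for _, text in ranked[:top_n]]


# Toy data (hypothetical): four region features with relevance scores.
region_features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.1]]
region_scores = [0.9, 0.2, 0.8, 0.05]

kept = select_regions(region_features, region_scores, k=2)
slide_embedding = mean_pool(kept)

# Hypothetical bank of historical cases: (embedding, report text).
case_bank = [([1.0, 0.0], "report A"), ([0.0, 1.0], "report B"), ([0.7, 0.7], "report C")]
references = retrieve_reports(slide_embedding, case_bank, top_n=2)
print(kept, references)
```

In a sketch like this, the retrieved report texts would be passed to the decoder as extra context; how the actual model fuses them is not described in the abstract.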

Results:

The proposed method is evaluated on one public dataset and one in-house dataset. On the public GastricADC dataset (991 WSIs), our method outperforms state-of-the-art text generation methods, achieving 0.568 and 0.345 on the ROUGE-L and BLEU-4 metrics, respectively. On the in-house Gastric-3300 dataset (3309 WSIs), our method achieves significantly better performance, with a ROUGE-L of 0.690, surpassing the second-best state-of-the-art method, Wcap, by 6.3%.
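For readers unfamiliar with ROUGE-L, the sketch below shows how a sentence-level score can be computed from the longest common subsequence (LCS) of tokens. It assumes whitespace tokenization and the balanced F1 variant; the abstract does not state which ROUGE-L variant or tokenizer was used, and the example sentences are invented for illustration.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """Balanced ROUGE-L F1 on whitespace tokens (an assumed variant)."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Invented example: five of six tokens appear in the same order in both texts.
score = rouge_l_f1(
    "moderately differentiated adenocarcinoma of the stomach",
    "well differentiated adenocarcinoma of the stomach",
)
print(round(score, 4))
```

Because ROUGE-L rewards in-order token overlap rather than exact n-gram matches, it tolerates inserted or substituted words better than BLEU-4, which is one reason the two metrics are often reported together.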

Conclusions:

We present an advanced method for pathology report generation from WSIs, addressing the key challenges associated with the large size and complex pathological structures of these images. In particular, the multi-level regional feature selection module effectively captures diagnostically significant regions of varying sizes. The knowledge retrieval-based decoder leverages historical diagnostic data to enhance report accuracy. Our method not only improves the informativeness and relevance of the generated pathology reports but also outperforms the state-of-the-art techniques.


Source journal

Computer Methods and Programs in Biomedicine (Engineering & Technology: Biomedical Engineering)

CiteScore: 12.30

Self-citation rate: 6.60%

Articles published: 601

Review time: 135 days

About the journal: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.