PViT-AIR: Puzzling vision transformer-based affine image registration for multi histopathology and faxitron images of breast tissue

Impact factor: 10.7 · Zone 1 (Medicine) · JCR Q1 (Computer Science, Artificial Intelligence)

Medical image analysis Pub Date : 2024-09-30 DOI:10.1016/j.media.2024.103356

Negar Golestani, Aihui Wang, Golnaz Moallem, Gregory R. Bean, Mirabela Rusu

Medical Image Analysis, volume 99, Article 103356. Full text: https://www.sciencedirect.com/science/article/pii/S1361841524002810


Abstract

Breast cancer is a significant global public health concern, with treatment options that depend on tumor characteristics. Pathological examination of excision specimens after surgery provides essential information for treatment decisions. However, manually selecting representative sections for histological examination is laborious and subjective, which can lead to sampling errors and variability, especially in carcinomas previously treated with chemotherapy. Accurately identifying residual tumors is also difficult, which motivates systematic or assisted methods. Developing deep-learning algorithms for automated cancer detection on radiology images requires radiology-pathology registration: aligning radiology and histopathology images yields the accurately labeled ground truth needed to train such algorithms. This alignment is challenging because of differences in image content and resolution, tissue deformation, artifacts, and imprecise correspondence. We present a novel deep learning-based pipeline for the affine registration of faxitron images, which are x-ray images of macrosections of ex-vivo breast tissue, with the corresponding histopathology images of tissue segments. The proposed model combines convolutional neural networks and vision transformers, allowing it to capture both local and global information from the entire tissue macrosection as well as from its segments. This integrated approach registers and stitches the image segments simultaneously, achieving segment-to-macrosection registration through a puzzling-based mechanism.
To address the limited availability of multi-modal ground truth data, we train the model on synthetic mono-modal data in a weakly supervised manner. The trained model performed well in multi-modal registration, with an average landmark error of 1.51 mm (±2.40) and a stitching distance of 1.15 mm (±0.94). The results indicate that the model performs significantly better than existing baselines, including both deep learning-based and iterative models, and that it is approximately 200 times faster than the iterative approach. This work bridges a gap between current research and clinical workflow and has the potential to improve efficiency and accuracy in breast cancer evaluation and to streamline pathology workflow.
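As a rough illustration of the quantities mentioned in the abstract, the sketch below shows how a 2D affine transform can be applied to landmarks and how a mean landmark error in millimetres might be computed. It is not the authors' code: the function names, the random parameter ranges, and the 0.1 mm/pixel spacing are assumptions chosen only for illustration, and the synthetic pair is a generic stand-in for training on data with a known transform.

```python
import numpy as np


def make_affine(angle_deg, scale, shear, translation):
    """Build a 3x3 homogeneous 2D affine matrix (hypothetical parameterization)."""
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    shr = np.array([[1.0, shear],
                    [0.0, 1.0]])
    scl = np.diag([scale[0], scale[1]])
    mat = np.eye(3)
    mat[:2, :2] = rot @ shr @ scl   # linear part: scale, then shear, then rotate
    mat[:2, 2] = translation        # translation part
    return mat


def apply_affine(mat, points):
    """Apply a 3x3 affine matrix to an (N, 2) array of points."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (homog @ mat.T)[:, :2]


def mean_landmark_error_mm(pred_mat, moving_pts, fixed_pts, mm_per_pixel):
    """Mean and standard deviation of landmark distances, in millimetres."""
    warped = apply_affine(pred_mat, moving_pts)
    dists = np.linalg.norm(warped - fixed_pts, axis=1) * mm_per_pixel
    return dists.mean(), dists.std()


def synthetic_pair(points, rng):
    """Sample a random affine (assumed ranges) and return warped points plus the matrix."""
    gt = make_affine(
        angle_deg=rng.uniform(-15, 15),
        scale=rng.uniform(0.9, 1.1, size=2),
        shear=rng.uniform(-0.05, 0.05),
        translation=rng.uniform(-20, 20, size=2),
    )
    return apply_affine(gt, points), gt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    moving = rng.uniform(0, 512, size=(10, 2))   # landmark coordinates in pixels
    fixed, gt = synthetic_pair(moving, rng)

    # Error when no registration is applied (identity) vs. the known transform.
    before = mean_landmark_error_mm(np.eye(3), moving, fixed, mm_per_pixel=0.1)
    after = mean_landmark_error_mm(gt, moving, fixed, mm_per_pixel=0.1)
    print("unregistered (mean, std) mm:", before)
    print("with known affine (mean, std) mm:", after)
```

Because the synthetic transform is known, the error with the true matrix is zero up to floating-point precision; a trained model would be scored by passing its predicted matrix in place of `gt`.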


Source journal: Medical Image Analysis (Engineering and Technology, Biomedical Engineering)

CiteScore: 22.10

Self-citation rate: 6.40%

Articles published: 309

Review time: 6.6 months

Journal description: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.