Fabian Hörst , Moritz Rempe , Lukas Heine , Constantin Seibold , Julius Keyl , Giulia Baldini , Selma Ugurel , Jens Siveke , Barbara Grünwald , Jan Egger , Jens Kleesiek
{"title":"CellViT:用于精确细胞分割和分类的视觉转换器","authors":"Fabian Hörst , Moritz Rempe , Lukas Heine , Constantin Seibold , Julius Keyl , Giulia Baldini , Selma Ugurel , Jens Siveke , Barbara Grünwald , Jan Egger , Jens Kleesiek","doi":"10.1016/j.media.2024.103143","DOIUrl":null,"url":null,"abstract":"<div><p>Nuclei detection and segmentation in hematoxylin and eosin-stained (H&E) tissue images are important clinical tasks and crucial for a wide range of applications. However, it is a challenging task due to nuclei variances in staining and size, overlapping boundaries, and nuclei clustering. While convolutional neural networks have been extensively used for this task, we explore the potential of Transformer-based networks in combination with large scale pre-training in this domain. Therefore, we introduce a new method for automated instance segmentation of cell nuclei in digitized tissue samples using a deep learning architecture based on Vision Transformer called CellViT. CellViT is trained and evaluated on the PanNuke dataset, which is one of the most challenging nuclei instance segmentation datasets, consisting of nearly 200,000 annotated nuclei into 5 clinically important classes in 19 tissue types. We demonstrate the superiority of large-scale in-domain and out-of-domain pre-trained Vision Transformers by leveraging the recently published <em>Segment Anything Model</em> and a ViT-encoder pre-trained on 104 million histological image patches — achieving state-of-the-art nuclei detection and instance segmentation performance on the PanNuke dataset with a mean panoptic quality of 0.50 and an <span><math><msub><mrow><mi>F</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span>-detection score of 0.83. The code is publicly available at <span>https://github.com/TIO-IKIM/CellViT</span><svg><path></path></svg>.</p></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"94 ","pages":"Article 103143"},"PeriodicalIF":10.7000,"publicationDate":"2024-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1361841524000689/pdfft?md5=46fcd2cc27812024574d4d5479fe2cb8&pid=1-s2.0-S1361841524000689-main.pdf","citationCount":"0","resultStr":"{\"title\":\"CellViT: Vision Transformers for precise cell segmentation and classification\",\"authors\":\"Fabian Hörst , Moritz Rempe , Lukas Heine , Constantin Seibold , Julius Keyl , Giulia Baldini , Selma Ugurel , Jens Siveke , Barbara Grünwald , Jan Egger , Jens Kleesiek\",\"doi\":\"10.1016/j.media.2024.103143\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Nuclei detection and segmentation in hematoxylin and eosin-stained (H&E) tissue images are important clinical tasks and crucial for a wide range of applications. However, it is a challenging task due to nuclei variances in staining and size, overlapping boundaries, and nuclei clustering. While convolutional neural networks have been extensively used for this task, we explore the potential of Transformer-based networks in combination with large scale pre-training in this domain. Therefore, we introduce a new method for automated instance segmentation of cell nuclei in digitized tissue samples using a deep learning architecture based on Vision Transformer called CellViT. CellViT is trained and evaluated on the PanNuke dataset, which is one of the most challenging nuclei instance segmentation datasets, consisting of nearly 200,000 annotated nuclei into 5 clinically important classes in 19 tissue types. We demonstrate the superiority of large-scale in-domain and out-of-domain pre-trained Vision Transformers by leveraging the recently published <em>Segment Anything Model</em> and a ViT-encoder pre-trained on 104 million histological image patches — achieving state-of-the-art nuclei detection and instance segmentation performance on the PanNuke dataset with a mean panoptic quality of 0.50 and an <span><math><msub><mrow><mi>F</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span>-detection score of 0.83. The code is publicly available at <span>https://github.com/TIO-IKIM/CellViT</span><svg><path></path></svg>.</p></div>\",\"PeriodicalId\":18328,\"journal\":{\"name\":\"Medical image analysis\",\"volume\":\"94 \",\"pages\":\"Article 103143\"},\"PeriodicalIF\":10.7000,\"publicationDate\":\"2024-03-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S1361841524000689/pdfft?md5=46fcd2cc27812024574d4d5479fe2cb8&pid=1-s2.0-S1361841524000689-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical image analysis\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1361841524000689\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical image analysis","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1361841524000689","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
CellViT: Vision Transformers for precise cell segmentation and classification
Nuclei detection and segmentation in hematoxylin and eosin-stained (H&E) tissue images are important clinical tasks and crucial for a wide range of applications. However, it is a challenging task due to nuclei variances in staining and size, overlapping boundaries, and nuclei clustering. While convolutional neural networks have been extensively used for this task, we explore the potential of Transformer-based networks in combination with large scale pre-training in this domain. Therefore, we introduce a new method for automated instance segmentation of cell nuclei in digitized tissue samples using a deep learning architecture based on Vision Transformer called CellViT. CellViT is trained and evaluated on the PanNuke dataset, which is one of the most challenging nuclei instance segmentation datasets, consisting of nearly 200,000 annotated nuclei into 5 clinically important classes in 19 tissue types. We demonstrate the superiority of large-scale in-domain and out-of-domain pre-trained Vision Transformers by leveraging the recently published Segment Anything Model and a ViT-encoder pre-trained on 104 million histological image patches — achieving state-of-the-art nuclei detection and instance segmentation performance on the PanNuke dataset with a mean panoptic quality of 0.50 and an -detection score of 0.83. The code is publicly available at https://github.com/TIO-IKIM/CellViT.
期刊介绍:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.