Spatial Mapping of Gene Signatures in Hematoxylin and Eosin-Stained Images: A Proof of Concept for Interpretable Predictions Using Additive Multiple Instance Learning
Authors: Miles Markey, Juhyun Kim, Zvi Goldstein, Ylaine Gerardin, Jacqueline Brosnan-Cashman, Syed Ashar Javed, Dinkar Juyal, Harshith Pagidela, Limin Yu, Bahar Rahsepar, John Abel, Stephanie Hennek, Archit Khosla, Amaro Taylor-Weiner, Chintan Parmar
Journal: Modern Pathology, vol. 38, no. 8, Article 100772
Published: 2025-04-11
DOI: 10.1016/j.modpat.2025.100772
URL: https://www.sciencedirect.com/science/article/pii/S0893395225000687
Abstract
The relative abundance of cancer-associated fibroblast (CAF) subtypes influences a tumor’s response to treatment, especially immunotherapy. However, the gene expression signatures associated with these CAF subtypes have yet to realize their potential as clinical biomarkers. Here, we describe an interpretable machine learning approach, additive multiple instance learning (aMIL), to predict bulk gene expression signatures from hematoxylin and eosin-stained whole-slide images, focusing on an immunosuppressive LRRC15+ CAF-enriched TGFβ-CAF signature. aMIL models accurately predicted TGFβ-CAF across various cancer types. Tissue regions contributing most highly to slide-level predictions of TGFβ-CAF were evaluated by machine learning models characterizing spatial distributions of diverse cell and tissue types, stromal subtypes, and nuclear morphology. In breast cancer, regions contributing most to TGFβ-CAF-high predictions (“excitatory”) were localized to cancer stroma with high fibroblast density and mature collagen fibers. Regions contributing most to TGFβ-CAF-low predictions (“inhibitory”) were localized to cancer epithelium and densely inflamed stroma. Fibroblast and lymphocyte nuclear morphology also differed between excitatory and inhibitory regions. Thus, aMIL enables a data-driven link between histologic features and transcription, offering biological interpretability beyond typical black-box models.
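The additive structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes one common formulation of additive multiple instance learning, in which attention-scaled patch features pass through a per-patch predictor and the slide-level score is the sum of the per-patch contributions. The function names, the linear attention and prediction heads, and the random toy features are all hypothetical. The point is only that each patch receives a signed contribution, so positive ("excitatory") and negative ("inhibitory") regions can be read off directly.

```python
import numpy as np


def additive_mil_predict(patch_features, attn_w, pred_w, pred_b=0.0):
    """Toy additive MIL forward pass (hypothetical linear heads).

    patch_features: (n_patches, n_features) array, one row per image patch.
    attn_w, pred_w: (n_features,) weight vectors for attention and prediction.
    Returns the slide-level score and the per-patch contributions.
    """
    # Attention logits per patch, turned into weights with a stable softmax.
    logits = patch_features @ attn_w
    logits = logits - logits.max()
    attention = np.exp(logits) / np.exp(logits).sum()

    # Each patch is scored on its own, after scaling by its attention weight.
    contributions = (attention[:, None] * patch_features) @ pred_w

    # Additivity: the slide score is the sum of patch contributions plus a bias.
    slide_score = contributions.sum() + pred_b
    return slide_score, contributions


def label_regions(contributions, top_k=2):
    """Return indices of the most positive and most negative patches."""
    order = np.argsort(contributions)
    inhibitory = order[:top_k]        # most negative contributions first
    excitatory = order[::-1][:top_k]  # most positive contributions first
    return excitatory, inhibitory


# Toy data standing in for patch embeddings from one whole-slide image.
rng = np.random.default_rng(0)
patch_features = rng.normal(size=(6, 4))
attn_w = rng.normal(size=4)
pred_w = rng.normal(size=4)

score, contrib = additive_mil_predict(patch_features, attn_w, pred_w)
excitatory, inhibitory = label_regions(contrib)
print("slide score:", score)
print("excitatory patches:", excitatory, "inhibitory patches:", inhibitory)
```

Because the slide score decomposes exactly into patch contributions, the contribution map can be overlaid on the slide and compared with independent cell, tissue, and nuclear morphology models, which is the kind of analysis the abstract reports for breast cancer.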
About the journal:
Modern Pathology, an international journal owned by the United States & Canadian Academy of Pathology (USCAP), publishes clinical and translational research studies in pathology.
Original manuscripts are the primary focus of Modern Pathology, complemented by editorials, reviews, and practice guidelines covering all facets of precision diagnostics in human pathology. The journal's scope includes advances in molecular diagnostics and genomic classification of diseases, immuno-oncology, computational science, applied bioinformatics, and digital pathology.