Triple-effect correction for Cell Painting data with contrastive and domain-adversarial learning
Authors: Chengwei Yan, Yu Zhang, Jiuxin Feng, Heyang Hua, Zhihan Ruan, Zhen Li, Siyu Li, Chaoyang Yan, Pingjing Li, Jian Liu, Shengquan Chen
Journal: Nature Communications (published 2025-07-25)
DOI: 10.1038/s41467-025-62193-z (https://doi.org/10.1038/s41467-025-62193-z)
Cell Painting (CP) is a high-throughput imaging technology that generates extensive cell-stained imaging data and provides unique morphological insights for biological research. However, CP data contain three types of technical effects, collectively referred to as triple effects: batch effects, plus gradient-influenced row effects and column effects (together, the well-position effects). Interactions among these technical effects can obscure true biological signals and complicate the characterization of CP data, making correction essential for reliable analysis. Here, we propose cpDistiller, a triple-effect correction method designed specifically for CP data, which couples a pre-trained segmentation model with a semi-supervised Gaussian mixture variational autoencoder that uses contrastive and domain-adversarial learning. Through extensive qualitative and quantitative experiments across various CP profiles, we demonstrate that cpDistiller effectively corrects triple effects, especially well-position effects, while preserving cellular heterogeneity. Moreover, cpDistiller captures system-level phenotypic responses to genetic perturbations and reliably infers gene functions and interactions, both independently and when combined with scRNA-seq data. cpDistiller also shows promising capability for identifying gene and compound targets, highlighting its potential utility in drug discovery and broader biological research.
About the journal:
Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.