CellBinDB: a large-scale multimodal annotated dataset for cell segmentation with benchmarking of universal models.

Impact factor 11.8; Region 2, Biology; JCR Q1, Multidisciplinary Sciences
Can Shi, Jinghong Fan, Zhonghan Deng, Huanlin Liu, Qiang Kang, Yumei Li, Jing Guo, Jingwen Wang, Jinjiang Gong, Sha Liao, Ao Chen, Ying Zhang, Mei Li
Journal: GigaScience, volume 14
DOI: 10.1093/gigascience/giaf069 (https://doi.org/10.1093/gigascience/giaf069)
Publication date: 2025-01-06
Publication type: Journal Article
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12206155/pdf/
Citations: 0

Abstract


In recent years, cell segmentation techniques have played a critical role in the analysis of biological images, especially for quantitative studies. Deep learning-based cell segmentation models have demonstrated remarkable performance in segmenting cell and nucleus boundaries, but they are typically tailored to specific modalities or require manual tuning of hyperparameters, limiting their generalizability to unseen data. Comprehensive datasets that support both the training of universal models and the evaluation of various segmentation techniques are essential for overcoming these limitations and promoting the development of more versatile cell segmentation solutions. Here, we present CellBinDB, a large-scale multimodal annotated dataset established for these purposes. CellBinDB contains more than 1,000 annotated images, each labeled to identify the boundaries of cells or nuclei, including 4',6-diamidino-2-phenylindole, single-stranded DNA, hematoxylin and eosin, and multiplex immunofluorescence staining, covering over 30 normal and diseased tissue types from human and mouse samples. Based on CellBinDB, we benchmarked 8 state-of-the-art and widely used cell segmentation technologies/methods, and our further analysis reveals that complex cell shapes reduce segmentation accuracy while higher image gradients improve boundary detection, offering insights for refining segmentation strategies across diverse imaging scenarios.
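
The abstract relates segmentation accuracy to image gradients. As a rough illustration only, the sketch below shows one way such quantities could be computed with NumPy: a mean gradient magnitude for an image and an intersection-over-union (IoU) score between a predicted and a reference mask. The function names, the toy images, and the choice of IoU as the accuracy measure are assumptions made for this example; the abstract does not state which metrics or gradient definition the authors used.

```python
import numpy as np


def mean_gradient_magnitude(image):
    """Average magnitude of the finite-difference intensity gradient.

    Hypothetical helper: one possible notion of "image gradient",
    not necessarily the definition used in the paper.
    """
    grad_rows, grad_cols = np.gradient(np.asarray(image, dtype=float))
    return float(np.hypot(grad_rows, grad_cols).mean())


def mask_iou(pred, truth):
    """Intersection over union of two boolean masks (1.0 if both are empty)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)


# Toy "nucleus": a bright square on a dark background.
sharp = np.zeros((32, 32))
sharp[8:24, 8:24] = 1.0

# The same object at one fifth of the contrast has weaker edges.
faint = 0.2 * sharp

# Reference mask and a prediction shifted four pixels to the right.
truth = sharp > 0
shifted = np.zeros((32, 32), dtype=bool)
shifted[8:24, 12:28] = True

print("gradient (sharp):", mean_gradient_magnitude(sharp))
print("gradient (faint):", mean_gradient_magnitude(faint))
print("IoU (shifted vs truth):", mask_iou(shifted, truth))
```

In this toy case the low-contrast image has a proportionally smaller mean gradient, and the shifted prediction overlaps the reference only partially, so its IoU is below 1.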

Source journal
GigaScience (category: Multidisciplinary Sciences)
CiteScore: 15.50
Self-citation rate: 1.10%
Articles published: 119
Review time: 1 week
About the journal: GigaScience seeks to transform data dissemination and utilization in the life and biomedical sciences. As an online open-access open-data journal, it specializes in publishing "big-data" studies encompassing various fields. Its scope includes not only "omic" type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale shareable data.