PCPAm - A dataset of histopathological images of penile cancer for classification tasks

IF 1.4, JCR Q3, Multidisciplinary Sciences
Marcos Gabriel Mendes Lauande, Geraldo Braz Júnior, João Dallyson Sousa de Almeida, Vandecia Rejane Monteiro Fernandes, Anselmo Cardoso de Paiva, Rui Miguel Gil da Costa, Amanda Mara Teles, Leandro Lima da Silva, Haissa Oliveira Brito, Flávia Castello Branco Vidal
Data in Brief, vol. 61, Article 111823. Published 2025-06-21. DOI: 10.1016/j.dib.2025.111823
https://www.sciencedirect.com/science/article/pii/S2352340925005505
Citations: 0

Abstract

PCPAm - A dataset of histopathological images of penile cancer for classification tasks
Penile cancer has an incidence strongly linked to sociocultural factors, being more common in underdeveloped countries like Brazil, where it represents approximately 2% of cancers affecting men. This dataset was created to address the scarcity of publicly available resources for classifying histopathological images in penile cancer research. The images were collected in 2021 from tissue samples obtained through biopsies of patients undergoing treatment for penile cancer. After staining with Hematoxylin and Eosin (H&E), the tissue samples were photographed using a Leica ICC50 HD camera attached to a bright-field microscope (Leica DM500). The dataset comprises 194 high-resolution images (2048 × 1536 pixels), categorized by magnification (40X and 100X) and pathological classification (Tumor or Non-Tumor). Metadata includes additional information such as histological grade and, for some images, HPV status. Although previous works have focused primarily on binary classification tasks, the dataset includes additional labels, such as histological grade and HPV (Human Papilloma Virus) presence, which provide opportunities for multi-label classification or other types of predictive modelling. These extended labels enhance the dataset’s versatility for more complex tasks in medical image analysis. The dataset holds significant reuse potential for machine learning tasks beyond binary classification, allowing researchers to explore additional layers of analysis, such as HPV detection and histological grading. It can also be used for model benchmarking and comparative studies in cancer research, contributing to developing new diagnostic tools. The dataset and metadata are available for further research and model development.
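The categorization described above (magnification, then Tumor or Non-Tumor) can be turned into a simple index for binary classification experiments. The following is a minimal sketch only: the abstract does not describe how the files are organized, so the `<root>/<magnification>/<class>/` folder layout, the folder names, the accepted file extensions, and the helper names `index_dataset` and `summarize` are all assumptions made for illustration and should be adjusted to the actual release.

```python
from collections import Counter
from dataclasses import dataclass
from pathlib import Path

# Assumed (hypothetical) folder names; the abstract only names the categories.
MAGNIFICATIONS = ("40X", "100X")
CLASSES = ("Non-Tumor", "Tumor")  # list position doubles as the integer label
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}  # assumed formats


@dataclass(frozen=True)
class Sample:
    path: Path
    magnification: str
    label: int  # 0 = Non-Tumor, 1 = Tumor


def index_dataset(root):
    """Collect samples from root/<magnification>/<class>/ (assumed layout)."""
    samples = []
    for magnification in MAGNIFICATIONS:
        for label, class_name in enumerate(CLASSES):
            folder = Path(root) / magnification / class_name
            if not folder.is_dir():
                continue  # tolerate missing folders instead of failing
            for path in sorted(folder.iterdir()):
                # Skip metadata or other non-image files in the same folder.
                if path.suffix.lower() in IMAGE_SUFFIXES:
                    samples.append(Sample(path, magnification, label))
    return samples


def summarize(samples):
    """Count samples per (magnification, class name) pair."""
    return Counter((s.magnification, CLASSES[s.label]) for s in samples)
```

Calling `summarize(index_dataset("path/to/PCPAm"))` (a placeholder path) gives per-group counts, which is a quick way to check class balance at each magnification before splitting the images into training and test sets. Histological grade and HPV status would need to be joined in from the metadata, whose format is not specified in the abstract.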
Source journal
Data in Brief (Multidisciplinary Sciences)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 996
Review time: 70 days
Journal description: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.