Abu Kowshir Bitto, Md. Hasan Imam Bijoy, Kamrul Hassan Shakil, Aka Das, Khalid Been Badruzzaman Biplob, Imran Mahmud, Syed Md. Minhaz Hossain
Data in Brief, Volume 60, Article 111572. Published 2025-05-01. DOI: 10.1016/j.dib.2025.111572
Impact factor: 1.0. JCR: Q3 (Multidisciplinary Sciences).
https://www.sciencedirect.com/science/article/pii/S235234092500304X
Citations: 0
Abstract
GastroEndoNet: Comprehensive endoscopy image dataset for GERD and polyp detection
The gastrointestinal (GI) system is fundamental to human health, supporting digestion, nutrient absorption, and waste elimination. Disruptions in GI function, such as gastroesophageal reflux disease (GERD) and gastrointestinal polyps, can lead to significant health complications if not diagnosed and managed early. However, manual interpretation of endoscopic images is time-consuming and prone to human error, highlighting the need for automated diagnostic tools. In this study, we introduce a comprehensive dataset of 24,036 high-quality endoscopic images, categorized into four classes: GERD, GERD Normal, Polyp, and Polyp Normal. The dataset is designed to facilitate research on the automated detection and classification of these conditions with machine learning algorithms. It consists of 4,006 primary images collected following endoscopic procedures, which were augmented using six distinct techniques, expanding the total number of images to 24,036. It includes 5,844 GERD images (974 primary images), 6,618 GERD Normal images (1,103 primary images), 4,674 Polyp images (779 primary images), and 6,900 Polyp Normal images (1,150 primary images). The images were obtained from Zainul Haque Sikder Women’s Medical College & Hospital (Pvt.) Ltd., pre-processed, resized to a resolution of 512 × 512 pixels, and saved in JPG format. This dataset addresses a critical gap in the availability of large, diverse, and well-labelled medical image datasets for training AI-driven healthcare solutions. It provides a resource for developing machine learning models for the automatic diagnosis, classification, and detection of GERD and polyps, potentially improving the speed and accuracy of clinical decision-making. By leveraging this dataset, researchers can contribute to diagnostic tools that could significantly improve healthcare outcomes and patient quality of life in gastroenterology.
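The per-class figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the primary counts are taken from the abstract, while the uniform multiplier of 6 images per primary image is an inference from the reported totals, not something the abstract states directly. In particular, the abstract does not say whether the primary images themselves are counted among the six. The names `PRIMARY_COUNTS`, `IMAGES_PER_PRIMARY`, `expanded_counts`, and `class_shares` are hypothetical.

```python
# Primary (pre-augmentation) image counts per class, as reported in the abstract.
PRIMARY_COUNTS = {
    "GERD": 974,
    "GERD Normal": 1103,
    "Polyp": 779,
    "Polyp Normal": 1150,
}

# Assumption: every primary image contributes 6 images to the final dataset.
# This factor is inferred from the reported totals (e.g. 974 * 6 = 5844).
IMAGES_PER_PRIMARY = 6


def expanded_counts(primary, factor):
    """Return the per-class image counts after applying a uniform factor."""
    return {label: count * factor for label, count in primary.items()}


def class_shares(counts):
    """Return each class's fraction of the total number of images."""
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


totals = expanded_counts(PRIMARY_COUNTS, IMAGES_PER_PRIMARY)
shares = class_shares(totals)

if __name__ == "__main__":
    print("primary images:", sum(PRIMARY_COUNTS.values()))
    print("total images:", sum(totals.values()))
    for label, count in totals.items():
        print(f"{label}: {count} images ({shares[label]:.1%} of dataset)")
```

Under this assumption the class totals reproduce the reported 5,844, 6,618, 4,674, and 6,900 images. The classes are moderately imbalanced (Polyp is the smallest), which may be worth accounting for when splitting the data or choosing evaluation metrics.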
About the journal:
Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.