SimIntestine: A synthetic dataset from virtual capsule endoscope
Authors: Sarita Singh, Basabi Bhaumik, Shouri Chatterjee
Journal: Medical Image Analysis, vol. 105, Article 103706 (published 2025-06-28)
DOI: 10.1016/j.media.2025.103706
URL: https://www.sciencedirect.com/science/article/pii/S1361841525002531
Impact factor: 11.8; JCR Q1 (Computer Science, Artificial Intelligence)
The absence of accurately position-annotated datasets of the human gastrointestinal (GI) tract limits both the efficient training of deep learning-based models for depth and pose estimation and the effective evaluation of their performance. Currently available synthetic datasets for the GI tract lack the intrinsic anatomical features and the associated textural characteristics. In this work, we have developed a method to generate virtual models of the small and large intestines of the human gastrointestinal system, integrated with a virtual capsule endoscope, that generates a position-annotated image dataset (SimIntestine) along with ground truth depth maps. The virtual intestines incorporate the distinctive anatomical characteristics of real intestines, such as plicae circulares, villi, haustral folds, and realistic textures, as well as physiological processes such as peristalsis. The virtual endoscope navigates through the virtual intestine in the same way as a real capsule endoscope and generates images that closely approximate the visual characteristics of those captured by a real endoscope. The framework additionally provides the camera’s orientation and position inside the virtual intestine, along with the depth of each image pixel. The proposed framework provides a comprehensive and physically realistic annotated synthetic benchmark dataset of the intestines that can be used to improve endoscopic video analysis, specifically pose estimation and simultaneous localization and mapping; such annotations are challenging to obtain from unannotated real endoscopy datasets. The SimIntestine dataset is used to evaluate two established benchmark techniques for depth and ego-motion estimation, namely Endo-SfMLearner and Monodepth2, and their results are discussed. The dataset has also been evaluated against other existing datasets, and its efficacy has been quantitatively affirmed by the improved performance metrics.
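The abstract does not state which performance metrics were used. As a rough sketch only, assuming the error measures commonly reported for monocular depth estimation (absolute relative error, RMSE, and the fraction of pixels with ratio below 1.25), per-pixel ground truth depth of the kind described above could be compared with a prediction as follows. The function name `depth_metrics` and the `min_depth` cutoff are hypothetical and do not come from the paper.

```python
import math


def depth_metrics(pred, gt, min_depth=1e-3):
    """Compare predicted and ground-truth depths given as flat sequences.

    Assumption: both sequences list the same pixels in the same order and
    use the same units. Pixels whose predicted or ground-truth depth is
    not above min_depth are skipped.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > min_depth and g > min_depth]
    if not pairs:
        raise ValueError("no valid pixels to compare")
    n = len(pairs)

    # Mean absolute error relative to the ground-truth depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Root mean squared error in depth units.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Fraction of pixels whose depth ratio max(p/g, g/p) is below 1.25.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n

    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


if __name__ == "__main__":
    # Toy values for illustration only; these are not dataset results.
    print(depth_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0]))
```

Self-supervised monocular methods predict depth only up to scale, so evaluations of such methods often rescale each prediction before computing these errors; that step is left out here to keep the sketch short.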
About the journal:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.