Weakly Supervised Captioning of Ultrasound Images
Mohammad Alsharid, Harshita Sharma, Lior Drukker, Aris T Papageorgiou, J Alison Noble
Medical Image Understanding and Analysis: 26th Annual Conference, MIUA 2022, Cambridge, UK, July 27-29, 2022, Proceedings. Vol. 13413, pp. 187-198.
Published: 2022-07-01
DOI: 10.1007/978-3-031-12053-4_14
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7614238/pdf/EMS159395.pdf
Citations: 0
Abstract
Medical image captioning models generate text describing the semantic contents of an image, aiding non-experts in understanding and interpretation. We propose a weakly supervised approach to improve the performance of image captioning models on small image-text datasets by leveraging a large anatomically labelled image classification dataset. Our method generates pseudo-captions (weak labels) for caption-less but anatomically labelled (class-labelled) images using an encoder-decoder sequence-to-sequence model. The augmented dataset is then used to train an image captioning model in a weakly supervised manner. For fetal ultrasound, we demonstrate that the proposed augmentation approach outperforms the baseline on semantics- and syntax-based metrics, with nearly double the improvement on BLEU-1 and ROUGE-L. Moreover, the proposed data augmentation trains superior models compared with existing regularization techniques. This work enables seamless automatic annotation of images that lack human-prepared descriptive captions for training image captioning models. Using pseudo-captions in the training data is particularly useful for medical image captioning, where obtaining real image captions demands significant time and effort from medical experts.
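The augmentation pipeline the abstract describes can be sketched in a few lines: a trained sequence-to-sequence model turns anatomical class labels into pseudo-captions, and the resulting (image, pseudo-caption) pairs are merged with the small human-captioned dataset. The sketch below is purely illustrative; `generate_pseudo_caption` is a hypothetical stand-in for the authors' encoder-decoder model, and the template captions and dataset shapes are assumptions, not details from the paper.

```python
# Illustrative sketch of the weakly supervised augmentation pipeline.
# generate_pseudo_caption is a stand-in for the trained encoder-decoder
# sequence-to-sequence model described in the abstract; the templates
# below are invented placeholders, not the model's real outputs.

def generate_pseudo_caption(anatomy_label: str) -> str:
    """Map an anatomical class label to a descriptive pseudo-caption
    (weak label). A real system would run a seq2seq decoder here."""
    templates = {
        "heart": "four chamber view of the fetal heart",
        "brain": "transventricular plane of the fetal brain",
        "abdomen": "transverse section of the fetal abdomen",
    }
    return templates.get(anatomy_label, f"view of the fetal {anatomy_label}")


def build_augmented_dataset(captioned, class_labelled_only):
    """Merge the small human-captioned set with pseudo-captioned images
    drawn from the large, class-labelled classification dataset."""
    augmented = list(captioned)  # existing (image_id, caption) pairs
    for image_id, label in class_labelled_only:
        augmented.append((image_id, generate_pseudo_caption(label)))
    return augmented  # used to train the captioning model


# Toy usage: one real caption plus two weakly labelled images.
captioned = [("img_001", "standard four chamber cardiac view")]
class_labelled = [("img_002", "brain"), ("img_003", "heart")]
train_set = build_augmented_dataset(captioned, class_labelled)
```

The key design point is that the captioning model never distinguishes real captions from pseudo-captions at training time; the weak supervision enters purely through the enlarged training set.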