Weakly Supervised Captioning of Ultrasound Images

Mohammad Alsharid, Harshita Sharma, Lior Drukker, Aris T Papageorgiou, J Alison Noble

Medical Image Understanding and Analysis (MIUA 2022), 26th Annual Conference, Cambridge, UK, July 27-29, 2022, Proceedings, vol. 13413, pp. 187-198.
Published: 2022-07-01. DOI: 10.1007/978-3-031-12053-4_14
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7614238/pdf/EMS159395.pdf
Medical image captioning models generate text describing the semantic content of an image, helping non-experts understand and interpret it. We propose a weakly supervised approach that improves the performance of image-captioning models on small image-text datasets by leveraging a large, anatomically labelled image-classification dataset. Our method uses an encoder-decoder sequence-to-sequence model to generate pseudo-captions (weak labels) for images that lack captions but carry anatomical class labels. The augmented dataset is then used to train an image-captioning model in a weakly supervised manner. For fetal ultrasound, we demonstrate that the proposed augmentation outperforms the baseline on semantics- and syntax-based metrics, with nearly twice the improvement on BLEU-1 and ROUGE-L. Moreover, models trained with the proposed data augmentation are superior to those trained with existing regularization techniques. This work enables seamless automatic annotation of images that lack human-written descriptive captions for training image-captioning models. Using pseudo-captions in the training data is particularly useful for medical image captioning, where obtaining real captions demands significant time and effort from medical experts.
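The augmentation pipeline the abstract describes can be sketched as follows. This is a minimal illustration of the data flow only: the paper's actual label-to-caption model is a trained encoder-decoder sequence-to-sequence network, which is replaced here by a hypothetical template-based stand-in, and the function and variable names are assumptions, not the authors' code.

```python
# Sketch of the weakly supervised augmentation pipeline: generate
# pseudo-captions for caption-less but class-labelled images, then merge
# them with the small human-captioned set to train a captioning model.

def generate_pseudo_caption(anatomy_label: str) -> str:
    """Stand-in for the seq2seq label-to-caption model (hypothetical).

    A real implementation would encode the anatomical label and decode a
    caption token by token; this template only illustrates the interface.
    """
    return f"ultrasound view showing the fetal {anatomy_label}"


def build_augmented_dataset(captioned, class_labelled):
    """Merge real image-caption pairs with pseudo-captioned images.

    captioned:      list of (image_id, human_caption) pairs
    class_labelled: list of (image_id, anatomy_label) pairs without captions
    """
    weak = [(img, generate_pseudo_caption(lbl)) for img, lbl in class_labelled]
    # The captioning model is then trained on real + weak labels together.
    return captioned + weak


if __name__ == "__main__":
    real = [("img001", "four-chamber view of the fetal heart")]
    unlabelled = [("img002", "abdomen"), ("img003", "femur")]
    data = build_augmented_dataset(real, unlabelled)
    print(len(data))  # 1 real pair + 2 pseudo-captioned pairs = 3
```

In the weakly supervised training that follows, the model treats pseudo-captions and human captions identically; the quality gain reported in the paper comes from the much larger effective training set this merge produces.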