Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing: Latest Articles, Page 2

Transformer based Generative Adversarial Network for Liver Segmentation

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-05-01 DOI: 10.48550/arXiv.2205.10663

Ugur Demir, Zheyuan Zhang, Bin Wang, M. Antalek, Elif Keles, Debesh Jha, A. Borhani, D. Ladner, Ulas Bagci

Automated liver segmentation from radiology scans (CT, MRI) can improve surgery and therapy planning and follow-up assessment, in addition to its conventional use for diagnosis and prognosis. Although convolutional neural networks (CNNs) have become the standard for image segmentation tasks, the field has recently started to shift towards Transformer-based architectures, because Transformers capture long-range dependencies in signals through the so-called attention mechanism. In this study, we propose a new segmentation approach, a hybrid that combines Transformers with a Generative Adversarial Network (GAN). The premise behind this choice is that the self-attention mechanism of Transformers allows the network to aggregate high-dimensional features and provide global information modeling. This mechanism provides better segmentation performance compared with traditional methods. Furthermore, we embed this generator in a GAN-based architecture so that the discriminator network can classify the credibility of the generated segmentation masks against the real masks coming from human (expert) annotations. This allows us to extract the high-dimensional topology information in the mask for biomedical image segmentation and provide more reliable segmentation results. Our model achieved a high Dice coefficient of 0.9433, recall of 0.9515, and precision of 0.9376, and outperformed other Transformer-based approaches. The implementation details of the proposed architecture can be found at https://github.com/UgurDemir/tranformer_liver_segmentation.

Volume 27 1, pages 340-347.
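The Dice coefficient, recall, and precision reported in the abstract above are standard overlap metrics for binary segmentation masks. The following is a minimal sketch of how such scores are commonly computed, assuming masks are given as nested lists of 0/1 values; the function names and the convention of returning 1.0 when a denominator is zero are illustrative assumptions, not taken from the authors' repository.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # hypothetical mask type: rows of 0/1 pixels


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives pixel by pixel."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def segmentation_scores(pred: Mask, target: Mask) -> Dict[str, float]:
    """Return Dice, precision, and recall for two binary masks of equal shape."""
    tp, fp, fn = confusion_counts(pred, target)
    # Assumption: an empty denominator (nothing predicted and/or nothing
    # annotated) is scored as 1.0 rather than raising an error.
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return {"dice": dice, "precision": precision, "recall": recall}


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    annotated = [[1, 0, 0], [0, 1, 1]]
    print(segmentation_scores(predicted, annotated))
```

In this toy example there are two true positives, one false positive, and one false negative, so all three scores equal 2/3.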

Citations: 4

Transformer based Generative Adversarial Network for Liver Segmentation.

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-05-01 Epub Date: 2022-08-04 DOI: 10.1007/978-3-031-13324-4_29

Ugur Demir, Zheyuan Zhang, Bin Wang, Matthew Antalek, Elif Keles, Debesh Jha, Amir Borhani, Daniela Ladner, Ulas Bagci

The abstract is identical to that of the preceding record for this paper.

Volume 13374, pages 340-347. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9894332/pdf/nihms-1866463.pdf

Citations: 0

FasterVideo: Efficient Online Joint Object Detection And Tracking

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-04-15 DOI: 10.1007/978-3-031-06433-3_32

Issa Mouawad, F. Odone

Citations: 3

Egocentric Human-Object Interaction Detection Exploiting Synthetic Data

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-04-14 DOI: 10.48550/arXiv.2204.07061

Rosario Leonardi, F. Ragusa, Antonino Furnari, G. Farinella

Citations: 11

Weakly Supervised Attended Object Detection Using Gaze Data as Annotations

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-04-14 DOI: 10.48550/arXiv.2204.07090

Michele Mazzamuto, F. Ragusa, Antonino Furnari, G. Signorello, G. Farinella

We consider the problem of detecting and recognizing the objects observed by visitors (i.e., attended objects) in cultural sites from egocentric vision. A standard approach to the problem involves detecting all objects and selecting the one that best overlaps with the gaze of the visitor, measured through a gaze tracker. Since labeling large amounts of data to train a standard object detector is expensive in terms of cost and time, we propose a weakly supervised version of the task that relies only on gaze data and a frame-level label indicating the class of the attended object. To study the problem, we present a new dataset composed of egocentric videos and gaze coordinates of subjects visiting a museum. We then compare three different baselines for weakly supervised attended object detection on the collected data. Results show that the considered approaches achieve satisfactory performance in a weakly supervised manner, which allows for significant time savings with respect to a fully supervised detector based on Faster R-CNN. To encourage research on the topic, we publicly release the code and the dataset at the following URL: https://iplab.dmi.unict.it/WS_OBJ_DET/

Volume 78 1, pages 263-274.
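The "standard approach" mentioned in the abstract above (detect all objects, then pick the one that best matches the visitor's gaze) can be illustrated with a small sketch. The `Detection` class, the `attended_object` function, and the specific selection rule (among boxes containing the gaze point, take the one whose center is nearest to it) are hypothetical choices made for illustration; the abstract does not specify how overlap with the gaze is measured.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output: a class label and an axis-aligned box."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, x: float, y: float) -> bool:
        """True if the point (x, y) lies inside the box, borders included."""
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2

    def center_distance(self, x: float, y: float) -> float:
        """Euclidean distance from the box center to the point (x, y)."""
        cx = (self.x1 + self.x2) / 2
        cy = (self.y1 + self.y2) / 2
        return hypot(cx - x, cy - y)


def attended_object(
    detections: Sequence[Detection], gaze_x: float, gaze_y: float
) -> Optional[Detection]:
    """Pick the detection the gaze most plausibly falls on, or None.

    Assumed rule: keep boxes that contain the gaze point, then choose the
    one whose center is closest to it.
    """
    candidates = [d for d in detections if d.contains(gaze_x, gaze_y)]
    if not candidates:
        return None
    return min(candidates, key=lambda d: d.center_distance(gaze_x, gaze_y))


if __name__ == "__main__":
    frame_detections = [
        Detection("statue", 0, 0, 100, 100),
        Detection("painting", 40, 40, 80, 80),
        Detection("vase", 200, 200, 300, 300),
    ]
    print(attended_object(frame_detections, 58, 62))
    print(attended_object(frame_detections, 150, 150))
```

This baseline needs bounding-box annotations to train the detector, which is the labeling cost that the weakly supervised formulation in the abstract aims to avoid.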

Citations: 3

Underwater Image Enhancement Using Pre-trained Transformer

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-04-08 DOI: 10.48550/arXiv.2204.04199

Abderrahmene Boudiaf, Yu Guo, Adarsh Ghimire, N. Werghi, G. Masi, S. Javed, J. Dias

Citations: 2

Engagement Detection with Multi-Task Training in E-Learning Environments

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-04-08 DOI: 10.48550/arXiv.2204.04020

Onur Çopur, Mert Nakıp, Simone Scardapane, Jürgen Slowack

Citations: 4

Online panoptic 3D reconstruction as a Linear Assignment Problem

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-04-01 DOI: 10.48550/arXiv.2204.00231

Leevi Raivio, Esa Rahtu

Real-time holistic scene understanding would allow machines to interpret their surroundings in a much more detailed manner than is currently possible. While panoptic image segmentation methods have brought image segmentation closer to this goal, this information has to be described relative to the 3D environment for the machine to be able to utilise it effectively. In this paper, we investigate methods for sequentially reconstructing static environments from panoptic image segmentations in 3D. We specifically target real-time operation: the algorithm must process data strictly online and be able to run at relatively fast frame rates. Additionally, the method should be scalable for environments large enough for practical applications. By applying a simple but powerful data-association algorithm, we outperform earlier similar works when operating purely online. Our method is also capable of reaching frame rates high enough for real-time applications and is scalable to larger environments as well. Source code and further demonstrations are released to the public at: https://tutvision.github.io/Online-Panoptic-3D/

Volume 43 1, pages 39-50.
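The title above frames data association as a linear assignment problem: given a cost for matching each newly observed segment to each existing instance, find the one-to-one matching with the lowest total cost. The sketch below illustrates that formulation only. The cost matrix and the `solve_assignment` function are hypothetical, and the brute-force search over permutations is chosen for readability; it is not the authors' algorithm and is only practical for very small matrices. A dedicated solver, for example `scipy.optimize.linear_sum_assignment`, would normally be used instead.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def solve_assignment(cost: Sequence[Sequence[float]]) -> Tuple[List[int], float]:
    """Solve a square linear assignment problem by exhaustive search.

    cost[i][j] is the (hypothetical) cost of matching observed segment i to
    existing instance j. Returns (assignment, total), where assignment[i] is
    the instance index matched to segment i.
    """
    n = len(cost)
    if any(len(row) != n for row in cost):
        raise ValueError("cost matrix must be square")

    def total_cost(perm: Tuple[int, ...]) -> float:
        return sum(cost[i][perm[i]] for i in range(n))

    # Try every one-to-one matching; O(n!) time, so only suitable for tiny n.
    best = min(permutations(range(n)), key=total_cost)
    return list(best), total_cost(best)


if __name__ == "__main__":
    # Rows: segments seen in the current frame; columns: instances in the map.
    example_cost = [
        [4, 1, 3],
        [2, 0, 5],
        [3, 2, 2],
    ]
    assignment, total = solve_assignment(example_cost)
    print(assignment, total)
```

For the example matrix the cheapest matching pairs segment 0 with instance 1, segment 1 with instance 0, and segment 2 with instance 2, for a total cost of 5. Note that greedily taking the cheapest entry first (segment 1 with instance 1, cost 0) would lead to a worse total, which is why the problem is solved jointly.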

Citations: 0

Medicinal Boxes Recognition on a Deep Transfer Learning Augmented Reality Mobile Application

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-03-26 DOI: 10.48550/arXiv.2203.14031

D. Avola, L. Cinque, Alessio Fagioli, G. Foresti, Marco Raoul Marini, Alessio Mecca, D. Pannone

Taking medicines is fundamental to curing illnesses. However, studies have shown that it can be hard for patients to remember the correct posology. Worse, a wrong dosage generally causes the disease to worsen. Although all relevant instructions for a medicine are summarized in the corresponding patient information leaflet, the latter is generally difficult to navigate and understand. To address this problem and help patients with their medication, in this paper we introduce an augmented reality mobile application that can present to the user important details on the framed medicine. In particular, the app implements an inference engine based on a deep neural network, i.e., a DenseNet, fine-tuned to recognize a medicine from its package. Subsequently, relevant information, such as posology or a simplified leaflet, is overlaid on the camera feed to help a patient when taking a medicine. Extensive experiments to select the best hyperparameters were performed on a dataset specifically collected to address this task, ultimately obtaining up to 91.30% accuracy as well as real-time capabilities.

Volume 75 1, pages 489-499.

Citations: 0

Deepfake Style Transfer Mixture: a First Forensic Ballistics Study on Synthetic Images

Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing Pub Date : 2022-03-18 DOI: 10.1007/978-3-031-06430-2_13

Luca Guarnera, O. Giudice, S. Battiato

Citations: 3