Proceedings of the ... International Conference on Image Analysis and Processing. International Conference on Image Analysis and Processing: Latest Articles

Transformer based Generative Adversarial Network for Liver Segmentation
Ugur Demir, Zheyu Zhang, Bin Wang, M. Antalek, Elif Keles, Debesh Jha, A. Borhani, D. Ladner, Ulas Bagci
DOI: https://doi.org/10.48550/arXiv.2205.10663
Published: 2022-05-01
Volume: 27 1
Pages: 340-347
Abstract: Automated liver segmentation from radiology scans (CT, MRI) can improve surgery and therapy planning and follow-up assessment, in addition to its conventional use for diagnosis and prognosis. Although convolutional neural networks (CNNs) have become the standard for image segmentation tasks, this has recently started to shift towards Transformer-based architectures, because Transformers can capture long-range dependencies in signals through the so-called attention mechanism. In this study, we propose a new segmentation approach that combines the Transformer with the Generative Adversarial Network (GAN). The premise behind this choice is that the self-attention mechanism of the Transformer allows the network to aggregate high-dimensional features and provide global information modeling. This mechanism provides better segmentation performance compared with traditional methods. Furthermore, we embed this generator in a GAN-based architecture so that the discriminator network can classify the credibility of the generated segmentation masks against the real masks from human (expert) annotations. This allows us to extract the high-dimensional topology information in the mask for biomedical image segmentation and provide more reliable segmentation results. Our model achieved a Dice coefficient of 0.9433, recall of 0.9515, and precision of 0.9376, and outperformed other Transformer-based approaches. The implementation details of the proposed architecture can be found at https://github.com/UgurDemir/tranformer_liver_segmentation.
Citations: 4
Transformer based Generative Adversarial Network for Liver Segmentation.
Ugur Demir, Zheyuan Zhang, Bin Wang, Matthew Antalek, Elif Keles, Debesh Jha, Amir Borhani, Daniela Ladner, Ulas Bagci
DOI: https://doi.org/10.1007/978-3-031-13324-4_29
Published: 2022-05-01
Volume: 13374
Pages: 340-347
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9894332/pdf/nihms-1866463.pdf
Abstract: Identical to the abstract of the arXiv record of this paper listed above (DOI 10.48550/arXiv.2205.10663).
Citations: 0
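The liver segmentation abstract above reports a Dice coefficient, recall, and precision. As a reminder of how these three scores relate, here is a minimal sketch in plain Python. It is not taken from the authors' repository: the function name `segmentation_scores`, the flat-list mask representation, and the toy masks are all assumptions made for illustration.

```python
def segmentation_scores(pred, truth):
    """Return (dice, precision, recall) for two flat binary masks.

    Hypothetical helper for illustration only; masks are equal-length
    sequences of 0/1 values, one entry per pixel or voxel.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)

    # Dice = 2*TP / (2*TP + FP + FN); empty masks are scored as a perfect match.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, precision, recall


# Toy example (assumed data): 3 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
dice, precision, recall = segmentation_scores(pred, truth)
print(dice, precision, recall)
```

For binary masks the Dice coefficient equals the F1 score, that is, the harmonic mean of precision and recall, so a Dice value always lies between the smaller and the larger of the two.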
FasterVideo: Efficient Online Joint Object Detection And Tracking
Issa Mouawad, F. Odone
DOI: https://doi.org/10.1007/978-3-031-06433-3_32
Published: 2022-04-15
Volume: 18 1
Pages: 375-387
Citations: 3
Egocentric Human-Object Interaction Detection Exploiting Synthetic Data
Rosario Leonardi, F. Ragusa, Antonino Furnari, G. Farinella
DOI: https://doi.org/10.48550/arXiv.2204.07061
Published: 2022-04-14
Volume: 83 1
Pages: 237-248
Abstract: We consider the problem of detecting Egocentric Human-Object Interactions (EHOIs) in industrial contexts. Since collecting and labeling large amounts of real images is challenging, we propose a pipeline and a tool to generate photo-realistic synthetic First Person Vision (FPV) images automatically labeled for EHOI detection in a specific industrial scenario. To tackle the problem of EHOI detection, we propose a method that detects the hands and the objects in the scene, and determines which objects are currently involved in an interaction. We compare the performance of our method with a set of state-of-the-art baselines. Results show that using a synthetic dataset improves the performance of an EHOI detection system, especially when little real data is available. To encourage research on this topic, we publicly release the proposed dataset at the following URL: https://iplab.dmi.unict.it/EHOI_SYNTH/.
Citations: 11
Weakly Supervised Attended Object Detection Using Gaze Data as Annotations
Michele Mazzamuto, F. Ragusa, Antonino Furnari, G. Signorello, G. Farinella
DOI: https://doi.org/10.48550/arXiv.2204.07090
Published: 2022-04-14
Volume: 78 1
Pages: 263-274
Abstract: We consider the problem of detecting and recognizing the objects observed by visitors (i.e., attended objects) in cultural sites from egocentric vision. A standard approach to the problem involves detecting all objects and selecting the one which best overlaps with the gaze of the visitor, measured through a gaze tracker. Since labeling large amounts of data to train a standard object detector is expensive in terms of cost and time, we propose a weakly supervised version of the task which relies only on gaze data and a frame-level label indicating the class of the attended object. To study the problem, we present a new dataset composed of egocentric videos and gaze coordinates of subjects visiting a museum. We then compare three different baselines for weakly supervised attended object detection on the collected data. Results show that the considered approaches achieve satisfactory performance in a weakly supervised manner, which allows for significant time savings with respect to a fully supervised detector based on Faster R-CNN. To encourage research on the topic, we publicly release the code and the dataset at the following URL: https://iplab.dmi.unict.it/WS_OBJ_DET/
Citations: 3
Underwater Image Enhancement Using Pre-trained Transformer
Abderrahmene Boudiaf, Yu Guo, Adarsh Ghimire, N. Werghi, G. Masi, S. Javed, J. Dias
DOI: https://doi.org/10.48550/arXiv.2204.04199
Published: 2022-04-08
Volume: 15 1
Pages: 480-488
Abstract: The goal of this work is to apply a denoising image transformer to remove the distortion from underwater images and compare it with other similar approaches. Automatic restoration of underwater images plays an important role since it increases the quality of the images without the need for more expensive equipment. This is a critical example of the important role of machine learning algorithms in supporting marine exploration and monitoring, reducing the need for human intervention such as the manual processing of the images, and thus saving time, effort, and cost. This paper is the first application of the image transformer-based approach called "Pre-Trained Image Processing Transformer" to underwater images. This approach is tested on the UFO-120 dataset, containing 1500 images with the corresponding clean images.
Citations: 2
Engagement Detection with Multi-Task Training in E-Learning Environments
Onur Çopur, Mert Nakıp, Simone Scardapane, Jürgen Slowack
DOI: https://doi.org/10.48550/arXiv.2204.04020
Published: 2022-04-08
Volume: 14 1 1
Pages: 411-422
Abstract: Recognition of user interaction, in particular engagement detection, has become crucial for online working and learning environments, especially during the COVID-19 outbreak. Such recognition and detection systems significantly improve the user experience and efficiency by providing valuable feedback. In this paper, we propose a novel Engagement Detection with Multi-Task Training (ED-MTT) system which minimizes mean squared error and triplet loss together to determine the engagement level of students in an e-learning environment. The performance of this system is evaluated and compared against the state of the art on a publicly available dataset as well as videos collected from real-life scenarios. The results show that ED-MTT achieves 6% lower MSE than the best state-of-the-art performance with highly acceptable training time and lightweight feature extraction. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Citations: 4
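The ED-MTT abstract above says the system minimizes mean squared error and triplet loss together. The sketch below shows one common way to combine the two terms into a single objective. It is only an illustration under stated assumptions: the abstract does not say how the terms are weighted, which distance is used, or what margin is chosen, so the Euclidean distance, the `weight` and `margin` parameters, and all function names here are hypothetical.

```python
import math


def mse(predictions, targets):
    """Mean squared error between predicted and true engagement levels."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def euclidean(a, b):
    """Euclidean distance between two embedding vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: pull the positive closer than the negative by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def combined_loss(predictions, targets, anchor, positive, negative,
                  weight=1.0, margin=1.0):
    """Weighted sum of the regression term and the embedding term (assumed form)."""
    return mse(predictions, targets) + weight * triplet_loss(anchor, positive, negative, margin)


# Toy values (assumed): MSE = 0.125, triplet term = max(0, 5 - 3 + 1) = 3.
loss = combined_loss(
    predictions=[0.5, 1.0], targets=[0.0, 1.0],
    anchor=[0.0, 0.0], positive=[3.0, 4.0], negative=[0.0, 3.0],
)
print(loss)
```

In a multi-task setup of this kind the regression term fits the engagement level directly, while the triplet term shapes the embedding space so that samples with similar engagement end up close together.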
Online panoptic 3D reconstruction as a Linear Assignment Problem
Leevi Raivio, Esa Rahtu
DOI: https://doi.org/10.48550/arXiv.2204.00231
Published: 2022-04-01
Volume: 43 1
Pages: 39-50
Abstract: Real-time holistic scene understanding would allow machines to interpret their surroundings in a much more detailed manner than is currently possible. While panoptic image segmentation methods have brought image segmentation closer to this goal, this information has to be described relative to the 3D environment for the machine to be able to utilise it effectively. In this paper, we investigate methods for sequentially reconstructing static environments from panoptic image segmentations in 3D. We specifically target real-time operation: the algorithm must process data strictly online and be able to run at relatively fast frame rates. Additionally, the method should be scalable to environments large enough for practical applications. By applying a simple but powerful data-association algorithm, we outperform earlier similar works when operating purely online. Our method is also capable of reaching frame rates high enough for real-time applications and is scalable to larger environments as well. Source code and further demonstrations are released to the public at: https://tutvision.github.io/Online-Panoptic-3D/
Citations: 0
Medicinal Boxes Recognition on a Deep Transfer Learning Augmented Reality Mobile Application
D. Avola, L. Cinque, Alessio Fagioli, G. Foresti, Marco Raoul Marini, Alessio Mecca, D. Pannone
DOI: https://doi.org/10.48550/arXiv.2203.14031
Published: 2022-03-26
Volume: 75 1
Pages: 489-499
Abstract: Taking medicines is fundamental to curing illnesses. However, studies have shown that it can be hard for patients to remember the correct posology. Worse, a wrong dosage generally causes the disease to worsen. Although all relevant instructions for a medicine are summarized in the corresponding patient information leaflet, the latter is generally difficult to navigate and understand. To address this problem and help patients with their medication, in this paper we introduce an augmented reality mobile application that can present to the user important details on the framed medicine. In particular, the app implements an inference engine based on a deep neural network, i.e., a DenseNet, fine-tuned to recognize a medicine from its package. Subsequently, relevant information, such as posology or a simplified leaflet, is overlaid on the camera feed to help a patient when taking a medicine. Extensive experiments to select the best hyperparameters were performed on a dataset specifically collected to address this task, ultimately obtaining up to 91.30% accuracy as well as real-time capabilities.
Citations: 0
Deepfake Style Transfer Mixture: a First Forensic Ballistics Study on Synthetic Images
Luca Guarnera, O. Giudice, S. Battiato
DOI: https://doi.org/10.1007/978-3-031-06430-2_13
Published: 2022-03-18
Volume: 11 1
Pages: 151-163
Citations: 3