{"title":"Special Session 5: Processing and Protection of Encrypted Multimedia Data","authors":"","doi":"10.1109/ipta54936.2022.9784117","DOIUrl":"https://doi.org/10.1109/ipta54936.2022.9784117","url":null,"abstract":"","PeriodicalId":381729,"journal":{"name":"2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"2008 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125579928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ψ-NET: A Novel Encoder-Decoder Architecture for Animal Segmentation","authors":"David Norman Díaz Estrada, Utkarsh Goyal, M. Ullah, F. A. Cheikh","doi":"10.1109/IPTA54936.2022.9784135","DOIUrl":"https://doi.org/10.1109/IPTA54936.2022.9784135","url":null,"abstract":"This paper proposes a novel Ψ-Net architecture that consists of three encoders and a decoder for animal image segmentation. The main characteristic of our proposed architecture is that the outputs at each depth level of the three encoders are summed up and then concatenated in the corresponding depth levels of the decoder for the upsampling process. We col-lected a new dataset consisting of 200 images for training the model, and we manually labelled the ground truth segmentation masks for these images. We trained our proposed model Ψ-Net on this dataset and compared the segmentation accu-racy with the classical U-Net and Y-Net architectures. Our proposed model achieved the highest accuracy on the dataset with 93% pixel accuracy, and 81.6% mean intersection-over-union (IoU).","PeriodicalId":381729,"journal":{"name":"2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121928437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monoplanar CT Reconstruction with GANs","authors":"Justus Schock, Yu-Chia Lan, D. Truhn, M. Kopaczka, Stefan Conrad, S. Nebelung, D. Merhof","doi":"10.1109/IPTA54936.2022.9784126","DOIUrl":"https://doi.org/10.1109/IPTA54936.2022.9784126","url":null,"abstract":"Reconstructing Computed Tomography images (CT) from radiographs currently requires biplanar radiographs for accurate CT reconstruction due to the complementary information contained in the individual views. However, in many cases biplanar information is not available. In this work, we therefore propose a KNN and a PCA-based approach using biplanar radiographs only at the training stage while performing the final inference using only a single anterior-posterior radiograph, thereby increasing the applicability and usability of the model. The methods are quantitatively validated on a multiview database achieving 81% PSNR of biplanar inference and also qualitatively on a dataset of radiographs with no corresponding CT scans.","PeriodicalId":381729,"journal":{"name":"2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124064473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weak supervision using cell tracking annotation and image registration improves cell segmentation","authors":"N. A. Anoshina, D. Sorokin","doi":"10.1109/IPTA54936.2022.9784140","DOIUrl":"https://doi.org/10.1109/IPTA54936.2022.9784140","url":null,"abstract":"Learning-based cell segmentation methods have proved to be very effective in cell tracking. The main difficulty of using machine learning is the lack of expert annotation of biomedical data. We propose a weakly-supervised approach that extends the amount of segmentation training data for image sequences where only a couple of frames are annotated. The approach uses the tracking annotations as weak labels and image registration to extend the segmentation annotation to the neighbouring frames. This technique was applied to cell segmentation step in the cell tracking problem. An experimental comparison of the baseline segmentation network trained on the data with pure GT annotation and the same segmentation network trained on the GT data and additional annotations generated with the proposed approach has been performed. The proposed weakly-supervised approach increased the IoU and SEG metrics on the data from the Cell Tracking Challenge.","PeriodicalId":381729,"journal":{"name":"2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134315837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pyramid Tokens-to-Token Vision Transformer for Thyroid Pathology Image Classification","authors":"Peng Yin, Bo Yu, Cheng-wei Jiang, Hechang Chen","doi":"10.1109/IPTA54936.2022.9784139","DOIUrl":"https://doi.org/10.1109/IPTA54936.2022.9784139","url":null,"abstract":"Histopathological image contains rich phenotypic information, which is beneficial to classifying tumor subtypes and predicting the development of diseases. The vast size of pathological slides makes it impossible to directly train whole slide images (WSI) on convolutional neural networks (CNNs). Most of the previous weakly supervision works divide high-resolution WSIs into small image patches and separately input them into the CNN to classify them as tumors or normal areas. The first difficulty is that although the method based on the CNN framework achieves a high accuracy rate, it increases the model parameters and computational complexity. The second difficulty is balancing the relationship between accuracy and model compu-tation. It makes the model maintain and improve the classification accuracy as much as possible based on the lightweight. In this paper, we propose a new lightweight architecture called Pyramid Tokens-to-Token VIsion Transformer (PyT2T-ViT) with multiple instance learning based on Vision Transformer. We introduce the feature extractor of the model with Token-to-Token ViT (T2T-ViT) to reduce the model parameters. The performance of the model is improved by combining the image pyramid of multiple receptive fields so that it can take into account the local and global features of the cell structure at a single scale. We applied the method to our collection of 560 thyroid pathology images from the same institution, model parameters and computation were greatly reduced. The classification effect is significantly better than the CNN-based method.","PeriodicalId":381729,"journal":{"name":"2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115703585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hand-Based Person Identification using Global and Part-Aware Deep Feature Representation Learning","authors":"Nathanael L. Baisa, Zheheng Jiang, Ritesh Vyas, Bryan Williams, Hossein Rahmani, P. Angelov, Sue Black","doi":"10.1109/IPTA54936.2022.9784133","DOIUrl":"https://doi.org/10.1109/IPTA54936.2022.9784133","url":null,"abstract":"In cases of serious crime, including sexual abuse, often the only available information with demonstrated potential for identification is images of the hands. Since this evidence is captured in uncontrolled situations, it is difficult to analyse. As global approaches to feature comparison are limited in this case, it is important to extend to consider local information. In this work, we propose hand-based person identification by learning both global and local deep feature representations. Our proposed method, Global and Part-Aware Network (GPA-Net), creates global and local branches on the conv-layer for learning robust discriminative global and part-level features. For learning the local (part-level) features, we perform uniform partitioning on the conv-layer in both horizontal and vertical directions. We retrieve the parts by conducting a soft partition without explicitly partitioning the images or requiring external cues such as pose estimation. We make extensive evaluations on two large multi-ethnic and publicly available hand datasets, demonstrating that our proposed method significantly outperforms competing approaches.","PeriodicalId":381729,"journal":{"name":"2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134023170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}