{"title":"Weakly supervised spatial–temporal attention network driven by tracking and consistency loss for action detection","authors":"Jinlei Zhu, Houjin Chen, Pan Pan, Jia Sun","doi":"10.1186/s13640-022-00588-4","DOIUrl":"https://doi.org/10.1186/s13640-022-00588-4","url":null,"abstract":"<p>This study proposes a novel network model for video action tube detection. The model is based on a location-interactive weakly supervised spatial–temporal attention mechanism driven by multiple loss functions. Annotating every target location in video frames is especially costly and time-consuming. Thus, we first propose a cross-domain weakly supervised learning method with a spatial–temporal attention mechanism for action tube detection. In the source domain, we trained a newly designed multi-loss spatial–temporal attention–convolution network on the source data set, which has both object location and classification annotations. In the target domain, we introduced an internal tracking loss and a neighbor-consistency loss and trained the network from the pre-trained model on the target data set, which has only inaccurate temporal action positions. Although this is a location-unsupervised method, it outperforms typical weakly supervised methods and even shows results comparable to some recent fully supervised methods. 
We also visualize the activation maps, which reveal the intrinsic reason behind the higher performance of the proposed method.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"14 2","pages":""},"PeriodicalIF":2.4,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138518834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance analysis of different DCNN models in remote sensing image object detection","authors":"Hua Liu, Jixiang Du, Yong Zhang, Hongbo Zhang","doi":"10.1186/s13640-022-00586-6","DOIUrl":"https://doi.org/10.1186/s13640-022-00586-6","url":null,"abstract":"","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"2022 1","pages":"1-18"},"PeriodicalIF":2.4,"publicationDate":"2022-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45878916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-orientation local ternary pattern-based feature extraction for forensic dentistry","authors":"Karunya Rajmohan, Askarunisa Abdul Khader","doi":"10.1186/s13640-022-00584-8","DOIUrl":"https://doi.org/10.1186/s13640-022-00584-8","url":null,"abstract":"<p>Accurate and automated identification of deceased victims from dental radiographs plays a significant role in forensic dentistry. Image processing techniques such as segmentation and feature extraction play a crucial role in retrieving the matching image. The raw image undergoes segmentation, feature extraction, and distance-based image retrieval. The ultimate goal of the proposed work is automated quality enhancement of the image through advanced enhancement, segmentation, feature extraction, and matching techniques. In this paper, multi-orientation local ternary pattern-based feature extraction is proposed. The grey level difference method (GLDM) is adopted to extract the texture and shape features. Image retrieval is done by computing a similarity score using distances such as Manhattan, Euclidean, vector cosine angle, and histogram intersection distance to obtain the optimal match from the database. A manually selected dataset of 200 images is used for performance analysis. 
By extracting both shape and texture features, the proposed approach achieved maximum accuracy, precision, recall, F-measure, sensitivity, and specificity, with lower false-positive and false-negative rates.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"21 2","pages":""},"PeriodicalIF":2.4,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138518832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
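The abstract above names four distance measures for computing the similarity score between a query and database feature vectors. A minimal illustrative sketch of that retrieval step follows; this is not code from the paper, and all function names are hypothetical:

```python
import numpy as np

def manhattan(a, b):
    # Sum of absolute coordinate differences (L1 distance).
    return float(np.abs(a - b).sum())

def euclidean(a, b):
    # Straight-line (L2) distance between feature vectors.
    return float(np.sqrt(((a - b) ** 2).sum()))

def vector_cosine_angle(a, b):
    # Cosine distance: 1 - cos(theta) between the two feature vectors.
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def histogram_intersection(a, b):
    # Intersection similarity in [0, 1] for histograms, converted to a distance.
    return float(1.0 - np.minimum(a, b).sum() / min(a.sum(), b.sum()))

def best_match(query, database, metric=euclidean):
    # Return the index of the database feature vector closest to the query.
    scores = [metric(query, feat) for feat in database]
    return int(np.argmin(scores))
```

In practice the same extracted feature vector (texture plus shape) would be compared under each distance, and the database entry with the smallest distance taken as the optimal match.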
{"title":"Face image synthesis from facial parts","authors":"Qiushi Sun, Jingtao Guo, Yi Liu","doi":"10.1186/s13640-022-00585-7","DOIUrl":"https://doi.org/10.1186/s13640-022-00585-7","url":null,"abstract":"<p>Recently, inspired by the growing power of deep convolutional neural networks (CNNs) and generative adversarial networks (GANs), facial image editing has received increasing attention and has produced a series of wide-ranging applications. In this paper, we propose a new and effective approach to a challenging task: synthesizing face images based on key facial parts. The proposed approach is a novel deep generative network that can automatically align facial parts with their precise positions in a face image and then output an entire facial image conditioned on the well-aligned parts. Specifically, three loss functions are introduced, which are the key to making the synthesized facial image realistic: a reconstruction loss to generate image content in the unknown region, a perceptual loss to enhance the network's ability to model high-level semantic structures, and an adversarial loss to ensure that the synthesized images are visually realistic. These three components cooperate well to form an effective framework for parts-based high-quality facial image synthesis. 
Finally, extensive experiments demonstrate the superior performance of this method to existing solutions.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"27 1","pages":""},"PeriodicalIF":2.4,"publicationDate":"2022-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138518814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An image-guided network for depth edge enhancement","authors":"Kuan-Ting Lee, Enyu Liu, J. Yang, Li Hong","doi":"10.1186/s13640-022-00583-9","DOIUrl":"https://doi.org/10.1186/s13640-022-00583-9","url":null,"abstract":"","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2022-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49354185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic kidney segmentation using 2.5D ResUNet and 2.5D DenseUNet for malignant potential analysis in complex renal cyst based on CT images","authors":"Parin Kittipongdaja, Thitirat Siriborvornratanakul","doi":"10.1186/s13640-022-00581-x","DOIUrl":"https://doi.org/10.1186/s13640-022-00581-x","url":null,"abstract":"<p>Bosniak renal cyst classification has been widely used to determine the complexity of a renal cyst. However, it turns out that about half of the patients undergoing surgery for Bosniak category III cysts take surgical risks that reward them with no clinical benefit at all, because their pathological results reveal that the cysts are actually benign, not malignant. This problem inspires us to use recently popular deep learning techniques and study alternative analytics methods for precise binary classification (benign or malignant tumor) on computed tomography (CT) images. To achieve our goal, two consecutive steps are required: segmenting kidneys or lesions from CT images, then classifying the segmented kidneys. In this paper, we propose a study of kidney segmentation using 2.5D ResUNet and 2.5D DenseUNet for efficiently extracting intra-slice and inter-slice features. Our models are trained and validated on the public data set from the Kidney Tumor Segmentation (KiTS19) challenge in two different training environments. As a result, all experimental models achieve high mean kidney Dice scores of at least 95% on the KiTS19 validation set consisting of 60 patients. Apart from the KiTS19 data set, we also conduct separate experiments on abdomen CT images of four Thai patients. 
On these four Thai patients, our experimental models show a drop in performance, with the best mean kidney Dice score at 87.60%.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"21 3","pages":""},"PeriodicalIF":2.4,"publicationDate":"2022-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138518831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
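The mean kidney Dice score used to evaluate the segmentation models above is the standard overlap measure 2|A∩B| / (|A| + |B|) between predicted and ground-truth masks. A minimal sketch of how it is computed on binary masks (illustrative only, not the authors' evaluation code):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    # Dice coefficient 2*|A∩B| / (|A| + |B|) for two binary masks;
    # eps keeps the ratio defined when both masks are empty.
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float((2.0 * inter + eps) / (pred.sum() + target.sum() + eps))

def mean_dice(preds, targets):
    # Mean Dice over a set of (prediction, ground truth) mask pairs,
    # e.g. one pair per patient in a validation set.
    return float(np.mean([dice_score(p, t) for p, t in zip(preds, targets)]))
```

A mean kidney Dice of 95% then corresponds to `mean_dice` of 0.95 over the per-patient kidney masks.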
{"title":"Adaptive response maps fusion of correlation filters with anti-occlusion mechanism for visual object tracking","authors":"Jianming Zhang, Hehua Liu, Yaoqi He, Li-Dan Kuang, Xi Chen","doi":"10.1186/s13640-022-00582-w","DOIUrl":"https://doi.org/10.1186/s13640-022-00582-w","url":null,"abstract":"<p>Despite the impressive robustness and accuracy of correlation filter-based trackers, they still have room for improvement. The majority of existing trackers use a single feature or fixed fusion weights, which can cause tracking to fail under deformation or severe occlusion. In this paper, we propose a multi-feature response map adaptive fusion strategy based on the consistency of the individual features and the fused feature. It improves robustness and accuracy by building a better object appearance model. Moreover, since the response map has multiple local peaks when the target is occluded, we propose an anti-occlusion mechanism. Specifically, if a nonmaximal local peak satisfies our proposed conditions, we generate a new response map by moving the center of the region of interest to that peak position and re-extracting features. We then select the response map with the largest response value as the final response map. This anti-occlusion mechanism can effectively cope with tracking failures caused by occlusion. Finally, we design a high-confidence model update strategy that adjusts the learning rate in different scenes to deal with the problem of model pollution. 
In addition, we conducted experiments on the OTB2013, OTB2015, TC128, and UAV123 datasets and compared the proposed algorithm with current state-of-the-art algorithms; it shows impressive advantages in terms of accuracy and robustness.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"12 1","pages":""},"PeriodicalIF":2.4,"publicationDate":"2022-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138518815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
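The adaptive fusion of per-feature response maps described above can be approximated by confidence-weighted averaging. The sketch below uses the peak-to-sidelobe ratio (PSR), a common correlation-filter confidence measure, as a generic weight; this is an assumption of the illustration, not the paper's exact consistency criterion:

```python
import numpy as np

def psr(response):
    # Peak-to-sidelobe ratio: how sharply the peak stands out from the rest
    # of the response map; a common confidence measure for correlation filters.
    peak = response.max()
    sidelobe = np.delete(response.ravel(), response.argmax())
    return float((peak - sidelobe.mean()) / (sidelobe.std() + 1e-7))

def fuse_responses(responses):
    # Weight each feature's response map by its PSR confidence, then combine.
    weights = np.array([psr(r) for r in responses])
    weights = weights / weights.sum()
    return sum(w * r for w, r in zip(weights, responses))

def locate_target(responses):
    # The predicted target position is the peak of the fused response map.
    fused = fuse_responses(responses)
    return np.unravel_index(fused.argmax(), fused.shape)
```

In the paper's anti-occlusion step, a qualifying nonmaximal local peak of such a map would trigger re-extraction of features around that peak before the final map is chosen.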
{"title":"Random CNN structure: tool to increase generalization ability in deep learning","authors":"B. Świderski, S. Osowski, Grzegorz Gwardys, J. Kurek, M. Słowińska, I. Lugowska","doi":"10.1186/s13640-022-00580-y","DOIUrl":"https://doi.org/10.1186/s13640-022-00580-y","url":null,"abstract":"","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2022-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49608624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Printing and scanning investigation for image counter forensics","authors":"Hailey James, O. Gupta, D. Raviv","doi":"10.1186/s13640-023-00610-3","DOIUrl":"https://doi.org/10.1186/s13640-023-00610-3","url":null,"abstract":"","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2022-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42053190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Printing and scanning investigation for image counter forensics","authors":"Hailey Joren, O. Gupta, D. Raviv","doi":"10.1186/s13640-022-00579-5","DOIUrl":"https://doi.org/10.1186/s13640-022-00579-5","url":null,"abstract":"","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"2022 1","pages":"1-15"},"PeriodicalIF":2.4,"publicationDate":"2022-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45050831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}