2021 Digital Image Computing: Techniques and Applications (DICTA) — Latest Publications

Streaming Multi-layer Ensemble Selection using Dynamic Genetic Algorithm
Anh Vu Luong, T. Nguyen, Alan Wee-Chung Liew
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647220
Abstract: In this study, we introduce a novel framework for non-stationary data stream classification problems by modifying the Genetic Algorithm to search for the optimal configuration of a streaming multi-layer ensemble. We aim to connect the two sub-fields of non-stationary stream classification and evolutionary dynamic optimization. First, we present Streaming Multi-layer Ensemble (SMiLE), a novel classification algorithm for non-stationary data streams which comprises multiple layers of different classifiers. Second, we develop an ensemble selection method to obtain an optimal subset of classifiers for each layer of SMiLE. We formulate the selection process as a dynamic optimization problem and then solve it by adapting the Genetic Algorithm to the stream setting, yielding a new classification framework called SMiLE_GA. Finally, we apply the proposed framework to a real-world problem of insect stream classification, which concerns the automatic recognition of insects through optical sensors in real time. The experiments show that the proposed method achieves better prediction accuracy than several state-of-the-art benchmark algorithms for non-stationary data stream classification.
Citations: 2
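As a rough illustration of the selection step described above, the sketch below runs a small Genetic Algorithm over binary masks that pick a classifier subset maximising majority-vote accuracy on a validation window. The classifier pool, data, and GA settings are all invented for the example and far simpler than the paper's dynamic, layered stream setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: predictions of 6 base classifiers on a validation
# window of 40 binary-labelled instances. Classifiers 0-2 are accurate,
# 3-5 are close to random.
y_val = rng.integers(0, 2, size=40)
preds = np.stack([
    np.where(rng.random(40) < p, y_val, 1 - y_val)
    for p in (0.9, 0.85, 0.9, 0.55, 0.5, 0.5)
])

def fitness(mask):
    """Majority-vote accuracy of the classifier subset selected by `mask`."""
    if mask.sum() == 0:
        return 0.0
    vote = (preds[mask.astype(bool)].mean(axis=0) >= 0.5).astype(int)
    return float((vote == y_val).mean())

# A minimal generational GA over binary selection masks.
pop = rng.integers(0, 2, size=(20, 6))
for _ in range(30):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]   # truncation selection
    cut = rng.integers(1, 6, size=10)              # one-point crossover
    kids = np.array([np.concatenate([parents[i, :c], parents[-i - 1, c:]])
                     for i, c in enumerate(cut)])
    flip = rng.random(kids.shape) < 0.1            # bit-flip mutation
    kids = np.where(flip, 1 - kids, kids)
    pop = np.vstack([parents, kids])

best = max(pop, key=fitness)
print(best, fitness(best))
```

In the stream setting the paper targets, this search would be re-run (or warm-started) as the validation window slides, which is what makes it a dynamic optimization problem.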
3D Morphable Ear Model: A Complete Pipeline from Ear Segmentation to Statistical Modeling
M. Mursalin, S. Islam, S. Z. Gilani
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647339
Abstract: The shape of the human ear contains crucial information that can be used for biometric identification. Analysis of the ear shape can be improved by using a statistical shape model known as a 3D Morphable Ear Model (3DMEM). In this work, we propose a complete pipeline to create the 3DMEM by following a three-step procedure. First, a large ear database is created by segmenting ears from 3D profile faces using a deep convolutional neural network. Next, dense correspondence between 3D ears is established using Generalized Procrustes Analysis (GPA). Finally, the 3DMEM is constructed using Principal Component Analysis (PCA). Our results show that the 3DMEM generalizes well to unseen 3D ear data.
Citations: 1
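Once dense correspondence is established, the PCA step reduces to standard linear algebra on flattened vertex vectors. The sketch below builds a toy morphable model from synthetic shapes (the data, vertex count, and number of modes are invented for illustration; the paper works with real corresponded ear meshes).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 30 "ears" in dense correspondence, each with 50
# 3D vertices, flattened to 150-dim shape vectors generated from a
# known 2-mode linear model plus small noise.
n_shapes, dim = 30, 150
true_mean = rng.normal(size=dim)
true_dirs = np.linalg.qr(rng.normal(size=(dim, 2)))[0]   # 2 orthonormal modes
coeffs = rng.normal(size=(n_shapes, 2)) * [3.0, 1.5]
shapes = true_mean + coeffs @ true_dirs.T + 0.01 * rng.normal(size=(n_shapes, dim))

# Build the morphable model: mean shape plus a PCA basis via SVD.
mean = shapes.mean(axis=0)
_, s, vt = np.linalg.svd(shapes - mean, full_matrices=False)
components = vt[:2]                        # keep the two dominant modes
stddev = s[:2] / np.sqrt(n_shapes - 1)     # per-mode standard deviations

# Synthesize a new ear two standard deviations along the first mode.
new_ear = mean + 2.0 * stddev[0] * components[0]

# The two modes should explain almost all of the variance.
explained = (s[:2] ** 2).sum() / (s ** 2).sum()
print(round(explained, 3))
```

Sampling coefficients within a few standard deviations of each mode, as in the last step, is how a morphable model generates plausible unseen shapes.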
Indoor Semantic Scene Understanding Using 2D-3D Fusion
Muraleekrishna Gopinathan, Giang Truong, Jumana Abu-Khalaf
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647182
Abstract: Seamless human-robot interaction is the ultimate goal of developing service robotic systems. To achieve this, robotic agents have to understand their surroundings in order to better complete a given task. Semantic scene understanding allows a robotic agent to extract semantic knowledge about the objects in the environment. In this work, we present a semantic scene understanding pipeline that fuses 2D and 3D detection branches to generate a semantic map of the environment. The 2D mask proposals from state-of-the-art 2D detectors are inverse-projected into 3D space and combined with 3D detections from point segmentation networks. Unlike previous works that were evaluated on collected datasets, we test our pipeline on an active photo-realistic robotic environment, BenchBot. Our novelty includes the rectification of 3D proposals using projected 2D detections and modality fusion based on object size. This work is done as part of the Robotic Vision Scene Understanding Challenge (RVSU). The performance evaluation demonstrates that our pipeline improves on baseline methods without a significant computational bottleneck.
Citations: 0
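When per-pixel depth is available, inverse-projecting a 2D mask pixel into 3D is a direct application of the pinhole camera model. The sketch below shows the per-pixel lifting with invented intrinsics; the paper's pipeline applies this kind of back-projection to every pixel of a 2D mask proposal.

```python
import numpy as np

# Hypothetical pinhole intrinsics (fx, fy, cx, cy) for an RGB-D camera;
# these values are illustrative, not taken from the paper.
fx, fy, cx, cy = 525.0, 525.0, 320.0, 240.0

def lift(u, v, depth):
    """Back-project pixel (u, v) with metric depth to a 3D camera-frame point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def project(p):
    """Forward pinhole projection, used here as a round-trip sanity check."""
    x, y, z = p
    return np.array([fx * x / z + cx, fy * y / z + cy])

pt = lift(400.0, 300.0, 2.0)
uv = project(pt)
print(pt, uv)
```

The round trip recovers the original pixel coordinates, which is the property the fusion step relies on when aligning 2D masks with 3D point detections.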
AUTOMATIC SHEEP BEHAVIOUR ANALYSIS USING MASK R-CNN
Jingsong Xu, Qiang Wu, Jian Zhang, Amy Tait
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647101
Abstract: The issue of sheep welfare during live exports has recently triggered considerable public concern. Extensive research is being carried out to monitor and improve animal welfare. Stocking density can be a critical factor affecting sheep welfare during export, and its impact can be monitored through sheep behaviour, position, group dynamics and physiology. In this paper we demonstrate the application of the instance segmentation method Mask R-CNN to support sheep behaviour recognition. As an initial step, two typical behaviours, standing and lying, are recognized under different group sizes in pens over time. A mAP above 94% was achieved on the validation set, demonstrating the effectiveness of the method at identifying sheep behaviours. Further data analysis will provide available space requirements for additional sheep allocation and daily behaviour monitoring to detect abnormal cases, with the aim of improving the health and wellbeing of sheep on ships.
Citations: 7
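In the paper, standing and lying are predicted directly as Mask R-CNN classes. As a hedged toy alternative, the sketch below shows the kind of per-instance geometry that segmentation masks make available: a crude elongation rule on a binary mask, with both the rule and the example masks invented purely for illustration.

```python
import numpy as np

def behaviour_from_mask(mask):
    """Toy heuristic (NOT the paper's learned classifier): call an instance
    'lying' when its mask footprint is strongly elongated, else 'standing'.
    Illustrates post-processing on Mask R-CNN's per-instance masks."""
    ys, xs = np.nonzero(mask)
    h, w = np.ptp(ys) + 1, np.ptp(xs) + 1
    elongation = max(h, w) / min(h, w)
    return "lying" if elongation > 1.6 else "standing"

# Two synthetic masks: a long, thin footprint and a compact one.
lying_mask = np.zeros((40, 60), dtype=bool)
lying_mask[10:18, 5:45] = True            # 8 x 40 footprint
standing_mask = np.zeros((40, 60), dtype=bool)
standing_mask[5:20, 10:24] = True         # 15 x 14 footprint

print(behaviour_from_mask(lying_mask), behaviour_from_mask(standing_mask))
```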
Texture enhanced Statistical Region Merging with application to automatic knee bones segmentation from CT
Michael Howes, M. Bajger, Gobert N. Lee, Francesca Bucci, S. Martelli
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647224
Abstract: The Statistical Region Merging technique belongs to the portfolio of very successful image segmentation methods across diverse domains and applications. The method is based on a solid probabilistic principle and has been extended in various directions to suit specific applications, including those from medical domains. In its basic implementation the technique uses a merging criterion relying on image pixel intensities. While sufficient to segment many natural scene images well, it often deteriorates dramatically when challenging medical images are segmented. In this study we introduce a new merging criterion into the method which utilizes texture characteristics of the image. We demonstrate that the enhanced criterion allows segmentation of knee bones in CT comparable to state-of-the-art outcomes reported in the literature, while preserving the desirable properties of the original technique.
Citations: 2
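To make the idea concrete, the sketch below shows a simplified SRM-style merging test extended with a texture term: local standard deviation stands in for the paper's (unspecified here) texture characteristic, and the bound formula is a simplified stand-in for SRM's statistical deviation bound. All constants and data are invented for the example.

```python
import numpy as np

def srm_predicate(r1, r2, g=256.0, q=32.0, tex_weight=0.5):
    """Merge two regions (given as 1-D pixel-intensity arrays) when both
    their mean intensities and their standard deviations (a crude texture
    descriptor) are within an SRM-style, size-dependent bound."""
    def b(region):
        # Deviation bound that shrinks as the region grows.
        n = region.size
        return g * np.sqrt(np.log(6.0 * n * n) / (2.0 * q * n))
    bound = np.hypot(b(r1), b(r2))
    mean_close = abs(r1.mean() - r2.mean()) <= bound
    tex_close = abs(r1.std() - r2.std()) <= tex_weight * bound
    return bool(mean_close and tex_close)

rng = np.random.default_rng(2)
flat_a = rng.normal(100, 2, 400)     # smooth region, mean ~100
flat_b = rng.normal(102, 2, 400)     # smooth region, similar mean and texture
textured = rng.normal(100, 40, 400)  # similar mean but strongly textured

print(srm_predicate(flat_a, flat_b), srm_predicate(flat_a, textured))
```

The third region has nearly the same mean as the first, so an intensity-only criterion would tend to merge them; the added texture term is what keeps them apart, which mirrors the motivation for the paper's enhanced criterion.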
Semantic Attribute Enriched Storytelling from a Sequence of Images
Zainy M. Malakan, G. Hassan, M. Jalwana, Nayyer Aafaq, A. Mian
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647213
Abstract: Visual storytelling (VST) pertains to the task of generating story-based sentences from an ordered sequence of images. Contemporary techniques suffer from several limitations, such as inadequate encapsulation of visual variance and context capturing among the input sequence. Consequently, stories generated by such techniques often lack coherence, context and semantic information. In this research, we devise a 'Semantic Attribute Enriched Storytelling' (SAES) framework to mitigate these issues. To that end, we first extract the visual features of the input image sequence and the noun entities present in the visual input by employing an off-the-shelf object detector. The two features are concatenated to encapsulate the visual variance of the input sequence. The features are then passed through a Bidirectional-LSTM sequence encoder to capture the past and future context of the input image sequence, followed by an attention mechanism to enhance the discriminality of the input to the language model, i.e., a mogrifier-LSTM. Additionally, we incorporate semantic attributes, e.g., nouns, to complement the semantic context in the generated story. Detailed experimental and human evaluations are performed to establish the competitive performance of the proposed technique. We achieve up to 1.4% improvement on the BLEU metric over recent state-of-the-art methods.
Citations: 2
Edge Aware Commonality Modeling based Reference Frame for 360 Degree Video Coding
Ashek Ahmmed, M. Pickering, A. Lambert, M. Paul
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647051
Abstract: Video coding algorithms try to model the significant commonality that exists within a video sequence. This role is even more critical for 360-degree video sequences, given the enormous amount of associated data that needs to be stored and communicated. Moreover, in such sequences the video signal captured by omnidirectional cameras is projected onto a plane; hence the frames exhibit different characteristics compared to traditional video frames. Therefore the conventional block-based, translational motion modeling employed by modern video coding standards, such as HEVC, may not provide efficient compression of 360-degree video data. The edge position difference (EPD) measure based motion modeling (EPD-MM) has shown good motion compensation capabilities for traditional video sequences. EPD-MM is underpinned by the fact that, from one frame to the next, edges map to edges, and such a mapping can be captured by an appropriate motion model. Since 360-degree frames contain significant edge information, this paper adopts an edge aware commonality modeling technique for motion compensation. In particular, the EPD-MM based motion compensated prediction of the current 360-degree frame is generated from the already coded reference frame. Experimental results show that when this predicted frame is used as an additional reference frame, bit rate savings are obtained over an anchor HEVC encoder.
Citations: 3
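The "edges map to edges" idea can be illustrated with a toy edge position difference score: the fraction of reference edge pixels with no predicted edge pixel nearby. This formulation and its tolerance parameter are invented for illustration and are simpler than the paper's EPD measure.

```python
import numpy as np

def epd(edges_ref, edges_pred, tol=1):
    """Toy edge position difference: the fraction of reference edge pixels
    with no predicted edge pixel within `tol` (Chebyshev distance). Lower
    values mean the motion model mapped edges onto edges more faithfully."""
    # Dilate predicted edges by `tol` via shifted logical ORs.
    grown = np.zeros_like(edges_pred, dtype=bool)
    for dy in range(-tol, tol + 1):
        for dx in range(-tol, tol + 1):
            grown |= np.roll(np.roll(edges_pred, dy, axis=0), dx, axis=1)
    ref = edges_ref.astype(bool)
    return 1.0 - (ref & grown).sum() / max(ref.sum(), 1)

# A vertical edge and a copy displaced by one pixel: within tolerance,
# so the score is 0; against an empty edge map the score is 1.
a = np.zeros((16, 16), dtype=bool); a[:, 8] = True
b = np.zeros((16, 16), dtype=bool); b[:, 9] = True
print(epd(a, b), epd(a, np.zeros_like(a)))
```

A motion model fitted by minimising a score like this is what generates the additional reference frame the paper feeds to the HEVC encoder.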
Seagrass Detection from Underwater Digital Images using Faster R-CNN with NASNet
Md Kislu Noman, S. Islam, Jumana Abu-Khalaf, P. Lavery
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647325
Abstract: In recent years, deep learning has demonstrated great success in a variety of computer vision applications. The deep learning-based Faster R-CNN algorithm depends on a region proposal network that provides state-of-the-art object detection performance. To date, only a limited number of Faster R-CNN approaches have been attempted for detecting seagrass in underwater digital images. This paper proposes an improved seagrass detector that enhances detection performance by combining the Faster R-CNN framework with the NASNet-A backbone. The detector achieves a mean average precision (mAP) of 0.412 on the ECUHO-2 dataset, which is significantly better than the state-of-the-art Halophila ovalis detection performance on this dataset.
Citations: 5
Protecting Deep Cerebrospinal Fluid Cell Image Processing Models with Backdoor and Semi-Distillation
Fangqi Li, Shilin Wang, Zhenhua Wang
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647115
Abstract: Cerebrospinal fluid images are an informative source for the diagnosis of many diseases. Consequently, deep learning models for cerebrospinal fluid image processing are a promising computer-aided diagnosis technique. Current models can efficiently and correctly identify numerous categories of cells within an image of cerebrospinal fluid. Training a cerebrospinal fluid image processing model, especially a deep neural network, requires a vast amount of data and computation. Collecting the necessary data for medical tasks is an expensive procedure involving many experts, devices, and privacy concerns. It is therefore crucial to protect these deep models from piracy and reselling. In this paper, we study the problem of intellectual property protection for deep cerebrospinal fluid image processing models. We adopt the backdoor-based watermark as ownership evidence and propose a semi-distillation framework to embed the watermark into the model. The proposed scheme can verify the ownership of the genuine author, hence providing robust and unforgeable protection for deep cerebrospinal fluid image processing models.
Citations: 1
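The verification side of a backdoor-based watermark can be sketched simply: the owner keeps a secret trigger set, and ownership is claimed when a suspect model answers the triggers far above chance. The trigger count, class count, and threshold below are invented for illustration; the paper's embedding procedure (semi-distillation) is not shown.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical backdoor trigger set: 20 key inputs, each assigned a secret
# label out of 10 classes. A watermarked model memorises the pairs during
# embedding; an independent model can only answer at chance level.
n_triggers, n_classes = 20, 10
secret_labels = rng.integers(0, n_classes, n_triggers)

def trigger_accuracy(predict):
    """Accuracy of a model (a callable over trigger indices) on the trigger set."""
    preds = np.array([predict(i) for i in range(n_triggers)])
    return float((preds == secret_labels).mean())

watermarked = lambda i: secret_labels[i]                 # memorised responses
independent = lambda i: int(rng.integers(0, n_classes))  # chance behaviour

# Ownership claim: trigger accuracy far above the 1/n_classes chance level.
print(trigger_accuracy(watermarked), trigger_accuracy(independent))
```

Because the trigger-label pairs are secret and statistically unlikely to be matched by chance, a high trigger accuracy serves as the unforgeable ownership evidence the abstract refers to.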
Incremental Learning of Object Detector with Limited Training Data
Muhammad Abdullah Hafeez, A. Ul-Hasan, F. Shafait
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647245
Abstract: State-of-the-art deep learning models, despite being on par with humans in some challenging tasks, still suffer badly when they must keep learning over time. This open challenge of making deep learning models learn over time is referred to in the literature as lifelong learning, incremental learning or continual learning. In each increment, new classes/tasks are introduced to the existing model, which is trained on them while maintaining accuracy on the previously learned classes/tasks. However, the accuracy of a deep learning model on previously learned classes/tasks decreases with each increment. The main reason for this drop is catastrophic forgetting, an inherent flaw of deep learning models in which weights learned during past increments are disturbed while learning the classes/tasks of a new increment. Several approaches have been proposed to mitigate or avoid catastrophic forgetting, such as knowledge distillation, rehearsal over previous classes, or dedicated paths for different increments. In this work, we propose a novel approach based on transfer learning, which uses a pre-trained, shared and fixed network as a backbone, along with a dedicated network extension per increment for learning new tasks incrementally. The results show that our approach performs better in two ways. First, our model has significantly better overall incremental accuracy than the best-in-class model across different incremental configurations. Second, our approach achieves these results while maintaining the properties of a true incremental learning algorithm, i.e., it avoids catastrophic forgetting and completely eliminates the need for saved exemplars or retraining phases, which current state-of-the-art models require to maintain performance.
Citations: 0
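The structural reason this design avoids forgetting can be shown in a few lines: if the shared backbone is frozen and each increment gets its own head, learning a new task cannot disturb earlier tasks' weights. The sketch below uses a fixed random projection as a stand-in backbone and least-squares linear heads; all of it is an invented toy, not the paper's detector architecture.

```python
import numpy as np

rng = np.random.default_rng(4)

# Frozen, shared backbone: a fixed random projection standing in for the
# pre-trained feature extractor that the approach keeps unchanged.
W_backbone = rng.normal(size=(8, 32))
backbone = lambda x: np.tanh(x @ W_backbone)

heads = {}  # one dedicated extension (here: a linear head) per increment

def learn_increment(name, x, y):
    """Fit a new head by least squares on frozen backbone features.
    Earlier heads are never touched, so earlier tasks cannot be forgotten."""
    f = backbone(x)
    heads[name] = np.linalg.lstsq(f, y, rcond=None)[0]

def predict(name, x):
    return backbone(x) @ heads[name]

# Increment 1, then increment 2 on a different task.
x1, y1 = rng.normal(size=(100, 8)), rng.normal(size=(100, 1))
learn_increment("task1", x1, y1)
w_before = heads["task1"].copy()

x2, y2 = rng.normal(size=(100, 8)), rng.normal(size=(100, 1))
learn_increment("task2", x2, y2)

# task1's head is bit-for-bit unchanged after learning task2.
print(np.array_equal(w_before, heads["task1"]))
```

The trade-off is that the frozen backbone must already produce features useful for future tasks, which is why the approach builds on a pre-trained network.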