2021 Digital Image Computing: Techniques and Applications (DICTA) — Latest Publications

Streaming Multi-layer Ensemble Selection using Dynamic Genetic Algorithm
Anh Vu Luong, T. Nguyen, Alan Wee-Chung Liew
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647220
Abstract: In this study, we introduce a novel framework for non-stationary data stream classification problems by modifying the Genetic Algorithm to search for the optimal configuration of a streaming multi-layer ensemble. We aim to connect the two sub-fields of non-stationary stream classification and evolutionary dynamic optimization. First, we present Streaming Multi-layer Ensemble (SMiLE), a novel classification algorithm for non-stationary data streams which comprises multiple layers of different classifiers. Second, we develop an ensemble selection method to obtain an optimal subset of classifiers for each layer of SMiLE. We formulate the selection process as a dynamic optimization problem and then solve it by adapting the Genetic Algorithm to the stream setting, yielding a new classification framework called SMiLE_GA. Finally, we apply the proposed framework to a real-world problem of insect stream classification, which concerns the automatic recognition of insects through optical sensors in real time. The experiments show that the proposed method achieves better prediction accuracy than several state-of-the-art benchmark algorithms for non-stationary data stream classification.
Citations: 2
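As a rough illustration of the selection step described above, the sketch below runs a small Genetic Algorithm over binary masks that pick a classifier subset maximising majority-vote accuracy on a validation window. The classifier pool, data, and GA settings are all invented for the example and far simpler than the paper's dynamic, layered stream setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: predictions of 6 base classifiers on a validation
# window of 40 binary-labelled instances. Classifiers 0-2 are accurate,
# 3-5 are close to random.
y_val = rng.integers(0, 2, size=40)
preds = np.stack([
    np.where(rng.random(40) < p, y_val, 1 - y_val)
    for p in (0.9, 0.85, 0.9, 0.55, 0.5, 0.5)
])

def fitness(mask):
    """Majority-vote accuracy of the classifier subset selected by `mask`."""
    if mask.sum() == 0:
        return 0.0
    vote = (preds[mask.astype(bool)].mean(axis=0) >= 0.5).astype(int)
    return float((vote == y_val).mean())

# A minimal generational GA over binary selection masks.
pop = rng.integers(0, 2, size=(20, 6))
for _ in range(30):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]   # truncation selection
    cut = rng.integers(1, 6, size=10)              # one-point crossover
    kids = np.array([np.concatenate([parents[i, :c], parents[-i - 1, c:]])
                     for i, c in enumerate(cut)])
    flip = rng.random(kids.shape) < 0.1            # bit-flip mutation
    kids = np.where(flip, 1 - kids, kids)
    pop = np.vstack([parents, kids])

best = max(pop, key=fitness)
print(best, fitness(best))
```

In the stream setting the paper targets, this search would be re-run (or warm-started) as the validation window slides, which is what makes it a dynamic optimization problem.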
3D Morphable Ear Model: A Complete Pipeline from Ear Segmentation to Statistical Modeling
M. Mursalin, S. Islam, S. Z. Gilani
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647339
Abstract: The shape of the human ear contains crucial information that can be used for biometric identification. Analysis of the ear shape can be improved by using a statistical shape model known as a 3D Morphable Ear Model (3DMEM). In this work, we propose a complete pipeline to create the 3DMEM by following a three-step procedure. First, a large ear database is created by segmenting ears from 3D profile faces using a deep convolutional neural network. Next, dense correspondence between 3D ears is established using Generalized Procrustes Analysis (GPA). Finally, the 3DMEM is constructed using Principal Component Analysis (PCA). Our results show that the 3DMEM generalizes well to unseen 3D ear data.
Citations: 1
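Once dense correspondence is established, the PCA step reduces to standard linear algebra on flattened vertex vectors. The sketch below builds a toy morphable model from synthetic shapes (the data, vertex count, and number of modes are invented for illustration; the paper works with real corresponded ear meshes).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 30 "ears" in dense correspondence, each with 50
# 3D vertices, flattened to 150-dim shape vectors generated from a
# known 2-mode linear model plus small noise.
n_shapes, dim = 30, 150
true_mean = rng.normal(size=dim)
true_dirs = np.linalg.qr(rng.normal(size=(dim, 2)))[0]   # 2 orthonormal modes
coeffs = rng.normal(size=(n_shapes, 2)) * [3.0, 1.5]
shapes = true_mean + coeffs @ true_dirs.T + 0.01 * rng.normal(size=(n_shapes, dim))

# Build the morphable model: mean shape plus a PCA basis via SVD.
mean = shapes.mean(axis=0)
_, s, vt = np.linalg.svd(shapes - mean, full_matrices=False)
components = vt[:2]                        # keep the two dominant modes
stddev = s[:2] / np.sqrt(n_shapes - 1)     # per-mode standard deviations

# Synthesize a new ear two standard deviations along the first mode.
new_ear = mean + 2.0 * stddev[0] * components[0]

# The two modes should explain almost all of the variance.
explained = (s[:2] ** 2).sum() / (s ** 2).sum()
print(round(explained, 3))
```

Sampling coefficients within a few standard deviations of each mode, as in the last step, is how a morphable model generates plausible unseen shapes.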
Indoor Semantic Scene Understanding Using 2D-3D Fusion
Muraleekrishna Gopinathan, Giang Truong, Jumana Abu-Khalaf
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647182
Abstract: Seamless human-robot interaction is the ultimate goal of developing service robotic systems. To achieve this, robotic agents have to understand their surroundings in order to better complete a given task. Semantic scene understanding allows a robotic agent to extract semantic knowledge about the objects in the environment. In this work, we present a semantic scene understanding pipeline that fuses 2D and 3D detection branches to generate a semantic map of the environment. The 2D mask proposals from state-of-the-art 2D detectors are inverse-projected into 3D space and combined with 3D detections from point segmentation networks. Unlike previous works that were evaluated on collected datasets, we test our pipeline on an active photo-realistic robotic environment, BenchBot. Our novelty includes the rectification of 3D proposals using projected 2D detections and modality fusion based on object size. This work is done as part of the Robotic Vision Scene Understanding Challenge (RVSU). The performance evaluation demonstrates that our pipeline improves on baseline methods without a significant computational bottleneck.
Citations: 0
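When per-pixel depth is available, inverse-projecting a 2D mask pixel into 3D is a direct application of the pinhole camera model. The sketch below shows the per-pixel lifting with invented intrinsics; the paper's pipeline applies this kind of back-projection to every pixel of a 2D mask proposal.

```python
import numpy as np

# Hypothetical pinhole intrinsics (fx, fy, cx, cy) for an RGB-D camera;
# these values are illustrative, not taken from the paper.
fx, fy, cx, cy = 525.0, 525.0, 320.0, 240.0

def lift(u, v, depth):
    """Back-project pixel (u, v) with metric depth to a 3D camera-frame point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def project(p):
    """Forward pinhole projection, used here as a round-trip sanity check."""
    x, y, z = p
    return np.array([fx * x / z + cx, fy * y / z + cy])

pt = lift(400.0, 300.0, 2.0)
uv = project(pt)
print(pt, uv)
```

The round trip recovers the original pixel coordinates, which is the property the fusion step relies on when aligning 2D masks with 3D point detections.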
AUTOMATIC SHEEP BEHAVIOUR ANALYSIS USING MASK R-CNN
Jingsong Xu, Qiang Wu, Jian Zhang, Amy Tait
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647101
Abstract: The issue of sheep welfare during live exports has recently triggered considerable public concern. Extensive research is being carried out to monitor and improve animal welfare. Stocking density can be a critical factor affecting sheep welfare during export, and its impact can be monitored through sheep behaviour, position, group dynamics and physiology. In this paper we demonstrate the application of the instance segmentation method Mask R-CNN to support sheep behaviour recognition. As an initial step, two typical behaviours, standing and lying, are recognized under different group sizes in pens over time. A mAP above 94% was achieved on the validation set, demonstrating the effectiveness of the method at identifying sheep behaviours. Further data analysis will provide available space requirements for additional sheep allocation and daily behaviour monitoring to detect abnormal cases, with the aim of improving the health and wellbeing of sheep on ships.
Citations: 7
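In the paper, standing and lying are predicted directly as Mask R-CNN classes. As a hedged toy alternative, the sketch below shows the kind of per-instance geometry that segmentation masks make available: a crude elongation rule on a binary mask, with both the rule and the example masks invented purely for illustration.

```python
import numpy as np

def behaviour_from_mask(mask):
    """Toy heuristic (NOT the paper's learned classifier): call an instance
    'lying' when its mask footprint is strongly elongated, else 'standing'.
    Illustrates post-processing on Mask R-CNN's per-instance masks."""
    ys, xs = np.nonzero(mask)
    h, w = np.ptp(ys) + 1, np.ptp(xs) + 1
    elongation = max(h, w) / min(h, w)
    return "lying" if elongation > 1.6 else "standing"

# Two synthetic masks: a long, thin footprint and a compact one.
lying_mask = np.zeros((40, 60), dtype=bool)
lying_mask[10:18, 5:45] = True            # 8 x 40 footprint
standing_mask = np.zeros((40, 60), dtype=bool)
standing_mask[5:20, 10:24] = True         # 15 x 14 footprint

print(behaviour_from_mask(lying_mask), behaviour_from_mask(standing_mask))
```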
Texture enhanced Statistical Region Merging with application to automatic knee bones segmentation from CT
Michael Howes, M. Bajger, Gobert N. Lee, Francesca Bucci, S. Martelli
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647224
Abstract: The Statistical Region Merging technique belongs to the portfolio of very successful image segmentation methods across diverse domains and applications. The method is based on a solid probabilistic principle and has been extended in various directions to suit specific applications, including those from medical domains. In its basic implementation the technique uses a merging criterion relying on image pixel intensities. While sufficient to segment many natural scene images well, it often deteriorates dramatically when challenging medical images are segmented. In this study we introduce a new merging criterion into the method which utilizes texture characteristics of the image. We demonstrate that the enhanced criterion allows segmentation of knee bones in CT comparable to state-of-the-art outcomes reported in the literature, while preserving the desirable properties of the original technique.
Citations: 2
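To make the idea concrete, the sketch below shows a simplified SRM-style merging test extended with a texture term: local standard deviation stands in for the paper's (unspecified here) texture characteristic, and the bound formula is a simplified stand-in for SRM's statistical deviation bound. All constants and data are invented for the example.

```python
import numpy as np

def srm_predicate(r1, r2, g=256.0, q=32.0, tex_weight=0.5):
    """Merge two regions (given as 1-D pixel-intensity arrays) when both
    their mean intensities and their standard deviations (a crude texture
    descriptor) are within an SRM-style, size-dependent bound."""
    def b(region):
        # Deviation bound that shrinks as the region grows.
        n = region.size
        return g * np.sqrt(np.log(6.0 * n * n) / (2.0 * q * n))
    bound = np.hypot(b(r1), b(r2))
    mean_close = abs(r1.mean() - r2.mean()) <= bound
    tex_close = abs(r1.std() - r2.std()) <= tex_weight * bound
    return bool(mean_close and tex_close)

rng = np.random.default_rng(2)
flat_a = rng.normal(100, 2, 400)     # smooth region, mean ~100
flat_b = rng.normal(102, 2, 400)     # smooth region, similar mean and texture
textured = rng.normal(100, 40, 400)  # similar mean but strongly textured

print(srm_predicate(flat_a, flat_b), srm_predicate(flat_a, textured))
```

The third region has nearly the same mean as the first, so an intensity-only criterion would tend to merge them; the added texture term is what keeps them apart, which mirrors the motivation for the paper's enhanced criterion.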
Semantic Attribute Enriched Storytelling from a Sequence of Images
Zainy M. Malakan, G. Hassan, M. Jalwana, Nayyer Aafaq, A. Mian
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647213
Abstract: Visual storytelling (VST) pertains to the task of generating story-based sentences from an ordered sequence of images. Contemporary techniques suffer from several limitations, such as inadequate encapsulation of visual variance and context capturing among the input sequence. Consequently, stories generated by such techniques often lack coherence, context and semantic information. In this research, we devise a 'Semantic Attribute Enriched Storytelling' (SAES) framework to mitigate these issues. To that end, we first extract the visual features of the input image sequence and the noun entities present in the visual input by employing an off-the-shelf object detector. The two features are concatenated to encapsulate the visual variance of the input sequence. The features are then passed through a Bidirectional-LSTM sequence encoder to capture the past and future context of the input image sequence, followed by an attention mechanism to enhance the discriminality of the input to the language model, i.e., a mogrifier-LSTM. Additionally, we incorporate semantic attributes, e.g., nouns, to complement the semantic context in the generated story. Detailed experimental and human evaluations are performed to establish the competitive performance of the proposed technique. We achieve up to 1.4% improvement on the BLEU metric over recent state-of-the-art methods.
Citations: 2
Edge Aware Commonality Modeling based Reference Frame for 360 Degree Video Coding
Ashek Ahmmed, M. Pickering, A. Lambert, M. Paul
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647051
Abstract: Video coding algorithms try to model the significant commonality that exists within a video sequence. This role is even more critical for 360-degree video sequences, given the enormous amount of associated data that needs to be stored and communicated. Moreover, in such sequences the video signal captured by omnidirectional cameras is projected onto a plane; hence the frames exhibit different characteristics compared to traditional video frames. Therefore the conventional block-based, translational motion modeling employed by modern video coding standards, such as HEVC, may not provide efficient compression of 360-degree video data. The edge position difference (EPD) measure based motion modeling (EPD-MM) has shown good motion compensation capabilities for traditional video sequences. EPD-MM is underpinned by the fact that, from one frame to the next, edges map to edges, and such a mapping can be captured by an appropriate motion model. Since 360-degree frames contain significant edge information, this paper adopts an edge aware commonality modeling technique for motion compensation. In particular, the EPD-MM based motion compensated prediction of the current 360-degree frame is generated from the already coded reference frame. Experimental results show that when this predicted frame is used as an additional reference frame, bit rate savings are obtained over an anchor HEVC encoder.
Citations: 3
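The "edges map to edges" idea can be illustrated with a toy edge position difference score: the fraction of reference edge pixels with no predicted edge pixel nearby. This formulation and its tolerance parameter are invented for illustration and are simpler than the paper's EPD measure.

```python
import numpy as np

def epd(edges_ref, edges_pred, tol=1):
    """Toy edge position difference: the fraction of reference edge pixels
    with no predicted edge pixel within `tol` (Chebyshev distance). Lower
    values mean the motion model mapped edges onto edges more faithfully."""
    # Dilate predicted edges by `tol` via shifted logical ORs.
    grown = np.zeros_like(edges_pred, dtype=bool)
    for dy in range(-tol, tol + 1):
        for dx in range(-tol, tol + 1):
            grown |= np.roll(np.roll(edges_pred, dy, axis=0), dx, axis=1)
    ref = edges_ref.astype(bool)
    return 1.0 - (ref & grown).sum() / max(ref.sum(), 1)

# A vertical edge and a copy displaced by one pixel: within tolerance,
# so the score is 0; against an empty edge map the score is 1.
a = np.zeros((16, 16), dtype=bool); a[:, 8] = True
b = np.zeros((16, 16), dtype=bool); b[:, 9] = True
print(epd(a, b), epd(a, np.zeros_like(a)))
```

A motion model fitted by minimising a score like this is what generates the additional reference frame the paper feeds to the HEVC encoder.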
Seagrass Detection from Underwater Digital Images using Faster R-CNN with NASNet
Md Kislu Noman, S. Islam, Jumana Abu-Khalaf, P. Lavery
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647325
Abstract: In recent years, deep learning has demonstrated great success in a variety of computer vision applications. The deep learning-based Faster R-CNN algorithm depends on a region proposal network that provides state-of-the-art object detection performance. To date, only a limited number of Faster R-CNN approaches have been attempted for detecting seagrass in underwater digital images. This paper proposes an improved seagrass detector that enhances detection performance by combining the Faster R-CNN framework with the NASNet-A backbone. The detector achieves a mean average precision (mAP) of 0.412 on the ECUHO-2 dataset, which is significantly better than the state-of-the-art Halophila ovalis detection performance on this dataset.
Citations: 5
Protecting Deep Cerebrospinal Fluid Cell Image Processing Models with Backdoor and Semi-Distillation
Fangqi Li, Shilin Wang, Zhenhua Wang
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647115
Abstract: Cerebrospinal fluid images are an informative source for the diagnosis of many diseases. Consequently, deep learning models for cerebrospinal fluid image processing are a promising computer-aided diagnosis technique. Current models can efficiently and correctly identify numerous categories of cells within an image of cerebrospinal fluid. Training a cerebrospinal fluid image processing model, especially a deep neural network, requires a vast amount of data and computation. Collecting the necessary data for medical tasks is an expensive procedure involving many experts, devices, and privacy concerns. It is therefore crucial to protect these deep models from piracy and reselling. In this paper, we study the problem of intellectual property protection for deep cerebrospinal fluid image processing models. We adopt the backdoor-based watermark as ownership evidence and propose a semi-distillation framework to embed the watermark into the model. The proposed scheme can verify the ownership of the genuine author, hence providing robust and unforgeable protection for deep cerebrospinal fluid image processing models.
Citations: 1
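The verification side of a backdoor-based watermark can be sketched simply: the owner keeps a secret trigger set, and ownership is claimed when a suspect model answers the triggers far above chance. The trigger count, class count, and threshold below are invented for illustration; the paper's embedding procedure (semi-distillation) is not shown.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical backdoor trigger set: 20 key inputs, each assigned a secret
# label out of 10 classes. A watermarked model memorises the pairs during
# embedding; an independent model can only answer at chance level.
n_triggers, n_classes = 20, 10
secret_labels = rng.integers(0, n_classes, n_triggers)

def trigger_accuracy(predict):
    """Accuracy of a model (a callable over trigger indices) on the trigger set."""
    preds = np.array([predict(i) for i in range(n_triggers)])
    return float((preds == secret_labels).mean())

watermarked = lambda i: secret_labels[i]                 # memorised responses
independent = lambda i: int(rng.integers(0, n_classes))  # chance behaviour

# Ownership claim: trigger accuracy far above the 1/n_classes chance level.
print(trigger_accuracy(watermarked), trigger_accuracy(independent))
```

Because the trigger-label pairs are secret and statistically unlikely to be matched by chance, a high trigger accuracy serves as the unforgeable ownership evidence the abstract refers to.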
Incremental Learning of Object Detector with Limited Training Data
Muhammad Abdullah Hafeez, A. Ul-Hasan, F. Shafait
Pub Date: 2021-11-01 | DOI: 10.1109/DICTA52665.2021.9647245
Abstract: State-of-the-art deep learning models, despite being on par with humans in some challenging tasks, still suffer badly when they must keep learning over time. This open challenge of making deep learning models learn over time is referred to in the literature as lifelong learning, incremental learning or continual learning. In each increment, new classes/tasks are introduced to the existing model, which is trained on them while maintaining accuracy on the previously learned classes/tasks. However, the accuracy of a deep learning model on previously learned classes/tasks decreases with each increment. The main reason for this drop is catastrophic forgetting, an inherent flaw of deep learning models in which weights learned during past increments are disturbed while learning the classes/tasks of a new increment. Several approaches have been proposed to mitigate or avoid catastrophic forgetting, such as knowledge distillation, rehearsal over previous classes, or dedicated paths for different increments. In this work, we propose a novel approach based on transfer learning, which uses a pre-trained, shared and fixed network as a backbone, along with a dedicated network extension per increment for learning new tasks incrementally. The results show that our approach performs better in two ways. First, our model has significantly better overall incremental accuracy than the best-in-class model across different incremental configurations. Second, our approach achieves these results while maintaining the properties of a true incremental learning algorithm, i.e., it avoids catastrophic forgetting and completely eliminates the need for saved exemplars or retraining phases, which current state-of-the-art models require to maintain performance.
Citations: 0
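The structural reason this design avoids forgetting can be shown in a few lines: if the shared backbone is frozen and each increment gets its own head, learning a new task cannot disturb earlier tasks' weights. The sketch below uses a fixed random projection as a stand-in backbone and least-squares linear heads; all of it is an invented toy, not the paper's detector architecture.

```python
import numpy as np

rng = np.random.default_rng(4)

# Frozen, shared backbone: a fixed random projection standing in for the
# pre-trained feature extractor that the approach keeps unchanged.
W_backbone = rng.normal(size=(8, 32))
backbone = lambda x: np.tanh(x @ W_backbone)

heads = {}  # one dedicated extension (here: a linear head) per increment

def learn_increment(name, x, y):
    """Fit a new head by least squares on frozen backbone features.
    Earlier heads are never touched, so earlier tasks cannot be forgotten."""
    f = backbone(x)
    heads[name] = np.linalg.lstsq(f, y, rcond=None)[0]

def predict(name, x):
    return backbone(x) @ heads[name]

# Increment 1, then increment 2 on a different task.
x1, y1 = rng.normal(size=(100, 8)), rng.normal(size=(100, 1))
learn_increment("task1", x1, y1)
w_before = heads["task1"].copy()

x2, y2 = rng.normal(size=(100, 8)), rng.normal(size=(100, 1))
learn_increment("task2", x2, y2)

# task1's head is bit-for-bit unchanged after learning task2.
print(np.array_equal(w_before, heads["task1"]))
```

The trade-off is that the frozen backbone must already produce features useful for future tasks, which is why the approach builds on a pre-trained network.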