2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA): Latest Publications

Quality Classification and Segmentation of Sugarcane Billets Using Machine Vision
2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA) | Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034561
Abstract: Machine learning is widely used in agriculture to optimize practices such as planting, crop detection, and harvesting. The sugar industry is a major contributor to the global economy, valuable both as a food source and as a sustainable crop with useful byproducts. This paper presents three machine vision algorithms capable of performing quality classification and segmentation of raw sugarcane billets, developing a proof of concept for implementation at our industry partner's mill in NSW. Such a system has the potential to improve quality and reduce the costs associated with an essential yet labor-intensive, inefficient, and unreliable process. Two recent iterations of the popular You Only Look Once (YOLO) algorithm, YOLOR and YOLOX, are trained for classification, with the state-of-the-art Mask R-CNN network used for segmentation. The best-performing classification model, YOLOX, achieves a classification mAP50:95 of 90.1% across 7 classes in real time, with an average inference speed of 19.36 ms per image. Segmentation accuracy of AP50 of 70.8% and AR50:95 of 83.5% was achieved using the Mask R-CNN network.
Citations: 0
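The mAP50:95 and AR50:95 figures above average precision and recall over ten intersection-over-union (IoU) thresholds. As a generic illustration (not the authors' code), the IoU match test underlying those metrics can be sketched as:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# mAP50:95 averages AP over ten IoU thresholds, 0.50 to 0.95 in steps of 0.05.
IOU_THRESHOLDS = [0.50 + 0.05 * i for i in range(10)]

def is_true_positive(threshold, prediction, ground_truth):
    """A prediction matches a ground-truth box at a given IoU threshold."""
    return iou(prediction, ground_truth) >= threshold
```

A full evaluator would additionally greedily match predictions to ground truth by score and accumulate a precision-recall curve per class; the sketch shows only the threshold sweep that distinguishes mAP50 from mAP50:95.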
Dynamic point cloud compression using slicing focusing on self-occluded points
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034563
Abstract: Realistic digital representations of 3D objects and surroundings have recently been made possible by advances in computer graphics that allow real-time, realistic interactions between users and the physical world [1], [2]. Emerging technologies enable real-world objects, persons, and scenes to move dynamically and convincingly across users' views using a 3D point cloud [3]–[5]. A point cloud is a set of individual 3D points that are unorganized and carry no explicit relationships in 3D space [1], [6]. Each point has a 3D position but can also contain other attributes (e.g., texture, reflectance, colour, and normal), creating a realistic visual representation model for static and dynamic 3D objects [3], [7]. This is desirable for many applications such as geographic information systems, cultural heritage, immersive telepresence, telehealth, disabled access, 3D telepresence, telecommunication, autonomous driving, gaming and robotics, virtual reality (VR), and augmented reality (AR) [2], [8]. Point clouds are even required in the Metaverse, for example when creating avatars or content and for object-based interaction. The Metaverse is a virtual world that creates a network where anyone can interact through their avatars [9]. It is therefore critical to present the 3D virtual world as close to the real world as possible, with high resolution and minimal noise and blur.
Citations: 2
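The abstract describes a point cloud as an unorganized set of positions with optional attributes, processed here by slicing. A minimal sketch of that data model (illustrative only; the paper's slicing criterion for self-occluded points is more involved than a plain depth window):

```python
from dataclasses import dataclass

@dataclass
class Point:
    """One unorganized 3-D point: a position plus an optional attribute."""
    x: float
    y: float
    z: float
    colour: tuple = (0, 0, 0)  # (r, g, b)

def slice_cloud(points, z_min, z_max):
    """Keep only the points whose depth lies in [z_min, z_max).

    Partitioning a cloud into such slices lets a codec treat occluded
    and visible regions differently.
    """
    return [p for p in points if z_min <= p.z < z_max]
```

Because the points carry no connectivity, any such partition is purely geometric, which is what makes slicing-based compression schemes straightforward to parallelize.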
ComicLib: A New Large-Scale Comic Dataset for Sketch Understanding
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034579
Abstract: The sketch is essential in everyday communication and has received much attention in the computer vision community. In general, researchers use learning-based approaches to study sketch-based algorithms; these methods rely on large-scale data to train complex models to achieve satisfactory performance. Most existing datasets are drawn by unskilled users in a closed environment and are of low complexity, preventing deep learning models from extracting richer information. This paper proposes a new large-scale comic sketch dataset called ComicLib for sketch understanding. We scan 181,354 comic sketch images from a comic library and annotate them through a crowdsourcing annotation platform that we developed. The result is a dataset of millions of comic objects in 17 categories. We conduct comparative experiments on sketch recognition, retrieval, detection, generation, and colorization using a number of deep learning algorithms, providing benchmark performance figures for ComicLib. We hope that ComicLib can contribute to the field of sketch-based research.
Citations: 0
Salient Face Prediction without Bells and Whistles
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034571
Abstract: Salient face prediction in multiple-face videos is a fundamental task in machine vision, with applications in video editing and human-machine interaction. The field has seen significant progress in recent years, backed by large datasets consisting specifically of multi-face videos. As our first contribution, we show the promise of a visual-only baseline that achieves state-of-the-art results for salient face prediction, motivating a reconsideration of sophisticated multimodal, multi-stream architectures. We further show that a simple upstream task like active speaker detection can provide a reasonable baseline and match prior tailored models for detecting salient faces. Moreover, we bring to light inconsistencies in evaluation strategies, highlighting a need for standardization, and propose a ranking-based evaluation for the task. Overall, our work motivates a fundamental course correction before re-initiating the search for novel architectures and frameworks.
Citations: 0
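The abstract proposes ranking-based evaluation without specifying the metric; one common form (a hypothetical sketch, not necessarily the authors' protocol) ranks all faces in a frame by predicted saliency score and reports the rank assigned to the ground-truth salient face:

```python
def salient_rank(predicted_scores, true_index):
    """1-based rank of the ground-truth salient face when faces are
    ordered by descending predicted score (1 = best possible)."""
    order = sorted(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    return order.index(true_index) + 1

def mean_rank(per_frame_scores, per_frame_truth):
    """Average rank across frames; lower is better, and the metric stays
    meaningful when the number of faces varies from frame to frame."""
    ranks = [salient_rank(s, t)
             for s, t in zip(per_frame_scores, per_frame_truth)]
    return sum(ranks) / len(ranks)
```

Unlike hard top-1 accuracy, a rank metric distinguishes a near-miss (rank 2) from a gross error (rank 10), which is one reason ranking-based protocols are attractive for this task.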
FootSeg: Automatic Anatomical Segmentation of Foot Bones from Weight-Bearing Cone Beam CT Scans
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034620
Abstract: Weight-bearing cone beam CT (CBCT), which provides high-resolution scanning in the natural weight-bearing position, is an emerging technique in orthopedic research. The high-quality scans from CBCT machines have greatly facilitated the diagnosis and treatment of the human foot [1], such as foot alignment [2] and foot surgery [3], [4]. In these clinical practices, an essential step in analyzing a CBCT foot scan is the anatomical segmentation of the foot bones, which provides an overall understanding of the patient's situation.
Citations: 0
SimplestNet-Drone: An Efficient and Accurate Object Detection Algorithm for Drone Aerial Image Analytics
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034564
Abstract: Objects in images captured by drones are extremely difficult to detect due to varying camera angles, distances, sizes, and environmental conditions, making it challenging to detect objects accurately from a height. Nonetheless, object detection plays a crucial role in computer vision and has brought significant improvements to drone imagery. We apply the YOLOv5 framework with modified feature extraction and focus detection. Because the main difficulties with aerial images are object size and the viewing angle from high altitude, we propose a single-stage object detection model called "SimplestNet-Drone". We add a fourth prediction head to improve detection of the smallest objects and to improve detection speed. The model's prediction accuracy is further improved by an attention mechanism, which detects attention regions in the scene and suppresses unnecessary information. The model was trained and tested on the VisDrone dataset and compared with other object detection models. It shows great improvement, with a mean average precision of 63.72%, and improves on the YOLO architecture.
A real-time implementation of our model can be watched in the following YouTube video: https://youtu.be/De8t4tjtb6w
Citations: 0
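The abstract's attention mechanism "suppresses unnecessary information" without detailing its form. One widely used family it may resemble is channel attention, where each feature channel is re-weighted by a sigmoid of its global average activation; the sketch below is a generic, framework-free illustration of that idea, not the paper's module:

```python
import math

def channel_attention(feature_maps):
    """Gate a (C, H, W) feature volume, given as nested lists: each channel
    is scaled by sigmoid(global average of that channel), so weakly
    activated channels are damped and strongly activated ones kept."""
    gated = []
    for channel in feature_maps:
        n = len(channel) * len(channel[0])
        mean = sum(sum(row) for row in channel) / n
        weight = 1.0 / (1.0 + math.exp(-mean))  # sigmoid gate in (0, 1)
        gated.append([[v * weight for v in row] for row in channel])
    return gated
```

In a real detector this gating sits between convolution stages and the gate itself is learned (e.g., via a small MLP over the pooled vector); the sketch keeps only the pool-gate-rescale structure.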
Co-Graph Convolution for Instance Segmentation
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034643
Abstract: Segmenting diverse instances in varied contexts with a common model is a challenge for instance segmentation. In this paper, we address this problem by capturing rich relationship information and propose the Co-Graph Convolution Network (CGC-Net). Building on Mask R-CNN, we propose a co-graph convolution mask head. Specifically, we decouple the mask head into two mask heads and append a graph convolution layer to each to capture the corresponding relationship information: one focuses on the relationships between appearance features at each position of the instance, while the other attends to the semantic relationships between the channels of the instance's features. In addition, we add a co-relationship module to each graph convolution layer to share similar relationships between instances of the same category in an image. We integrate the outputs of the two mask heads by element-wise multiplication to improve the feature representation for the final instance segmentation prediction. Experiments on the MS COCO and Cityscapes datasets demonstrate our method's competitiveness compared with other state-of-the-art instance segmentation methods. Furthermore, to verify the generalization of CGC-Net, we add it to other instance segmentation networks, and the results show that our method still obtains stable performance gains.
Citations: 0
Prostate Cancer Diagnosis from Structured Clinical Biomarkers with Deep Learning: Anonymous Authors
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034567
Abstract: Prostate cancer (PC) is one of the most aggressive cancers, and early detection is indispensable for treatment. Biopsies are often carried out to determine the Gleason score of PC, which helps to predict its aggressiveness. As biopsies carry considerable risk, especially for elderly patients, machine learning can be used to predict the PC Gleason grade from clinical biomarkers, which are typically structured in a table. In this paper, we propose to use advanced tabular deep neural network architectures, such as TabNet and TabTransformer, to grade PC. We also perform a comparative study of various machine learning approaches for this purpose, including traditional methods, tree-based classifiers, and shallow neural networks. Our experimental results demonstrate the superior performance of the TabNet deep learning method.
Citations: 0
Image-based Detection of Dyslexic Readers from 2-D Scan path using an Enhanced Deep Transfer Learning Paradigm
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034577
Abstract: Dyslexia is a learning syndrome, most prevalent among school children, that causes poor reading and comprehension skills despite normal intelligence. It arises from a wide range of factors, and its exact cause is still unclear, which makes it difficult to develop a generalized detection model; feature engineering that extracts the features contributing most to a classifier's generalization is a significant challenge. Conventional approaches to predicting dyslexia rely on psychological assessments, imaging methods such as magnetic resonance imaging (MRI) and functional MRI, or electroencephalogram (EEG) signals, which are often not preferred for clinical disorders such as dyslexia, especially in children, due to their adverse effects. To overcome these problems, this work adopts an image-based technique that predicts dyslexia from eye gaze points recorded while reading. Eye movement tracking is non-invasive and provides rich indices of brain activity and cognitive processing. The gaze points tracked during reading are represented as 2-D scan path images. The work proposes an enhanced DenseNet deep transfer learning solution for feature engineering and classification: a deep model is pre-trained on the 2-D scan path images and then used to classify dyslexia via transfer learning. The proposed system exploits the key strengths of deep learning and transfer learning and has shown high performance compared to existing state-of-the-art machine learning models, with an accuracy of 96.36%. The results demonstrate that the enhanced deep transfer learning model performs well in identifying significant features and classifying dyslexia from 2-D scan path images.
Citations: 0
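The pipeline's first step turns a sequence of gaze fixations into a 2-D scan path image. A simplified, framework-free stand-in for that rasterization (the paper's rendering, resolution, and any trajectory encoding are assumptions not stated in the abstract):

```python
def scanpath_image(gaze_points, width, height):
    """Rasterize a sequence of (x, y) fixations into a 2-D count grid.

    Repeated fixations on the same cell accumulate, so dwell time is
    crudely encoded as intensity; out-of-bounds samples are dropped.
    """
    image = [[0] * width for _ in range(height)]
    for x, y in gaze_points:
        if 0 <= x < width and 0 <= y < height:
            image[y][x] += 1
    return image
```

Once gaze data is in image form, any pretrained image backbone (here, a DenseNet) can be fine-tuned on it, which is what makes the transfer learning step possible.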
Rethinking Decoupled Training with Bag of Tricks for Long-Tailed Recognition
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034607
Abstract: Learning from imbalanced datasets remains a significant challenge for real-world applications. Among existing approaches to long-tail recognition, decoupled training appears to achieve the best performance. Moreover, there are simple and effective tricks that can further improve decoupled learning and help models trained on long-tailed datasets become more robust to class imbalance. However, used inappropriately, these tricks can result in lower-than-expected recognition accuracy, and there is a lack of comprehensive empirical studies providing guidelines on how to combine them. In this paper, we explore existing long-tail visual recognition tricks and perform extensive experiments to analyze the impact of each trick in detail, arriving at an effective combination for decoupled training. Furthermore, we introduce a new loss function called hard mining loss (HML), which better teaches the model to discriminate between head and tail classes. In addition, unlike previous work, we introduce a new end-to-end learning scheme for decoupled training. We conducted our evaluation on the CIFAR10, CIFAR100, and iNaturalist 2018 datasets; the results show that our method outperforms existing methods that address class imbalance for image classification tasks (code will be made available). We believe that our approach will serve as a solid foundation for improving class imbalance problems in many other computer vision tasks.
Citations: 1
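Decoupled training typically learns the representation with instance-balanced sampling and then re-trains the classifier with class-balanced sampling. The second stage's sampling weights can be sketched generically as follows (a standard trick from the long-tail literature, not code from this paper):

```python
from collections import Counter

def class_balanced_weights(labels):
    """Per-sample sampling weights that give every class equal total
    probability mass, as typically used when re-training the classifier
    stage on a long-tailed dataset.

    A sample of class y gets weight 1 / (n_classes * count(y)), so the
    weights sum to 1 and each class contributes 1 / n_classes in total.
    """
    counts = Counter(labels)
    n_classes = len(counts)
    return [1.0 / (n_classes * counts[y]) for y in labels]
```

Feeding these weights to a weighted sampler makes a rare tail class as likely to be drawn per epoch as a head class with thousands of examples, which is the core mechanism the various "tricks" studied in the paper build on.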