2023 18th International Conference on Machine Vision and Applications (MVA): Latest Publications

A Hybrid Wheat Head Detection model with Incorporated CNN and Transformer
Shou Harada, Xian-Hua Han
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10216087
Abstract: Wheat head detection is an important research topic for production estimation and growth management. Motivated by the great advantages of deep convolutional neural networks (DCNNs) in many vision tasks, deep-learning-based methods have come to dominate the wheat head detection field, showing remarkable performance improvements over traditional image-processing methods. However, existing methods usually repurpose detection models designed for generic object detection and insufficiently account for the specific characteristics of wheat head images, such as large variation across growth stages, high density, and heavy overlap. This work proposes a novel hybrid wheat head detection model that incorporates a CNN and a transformer to model long-range dependencies. Specifically, we first employ a ResNet backbone to extract multi-scale features and leverage an inter-scale feature fusion module to aggregate coarse-to-fine features, capturing sufficient spatial detail to localize small wheat heads. Moreover, we propose a novel, efficient transformer block that combines a channel-direction self-attention module with a feed-forward subnet to explore interactions among the aggregated multi-scale features. Finally, a prediction head outputs the centerness and size of wheat heads, yielding a simple anchor-free detection model. Extensive experiments on the Global Wheat Head Detection (GWHD) dataset demonstrate the superiority of the proposed model over existing state-of-the-art methods as well as the baseline model.
Citations: 0
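The channel-direction self-attention named in this abstract computes attention weights between channels rather than between spatial positions, so the attention matrix is C x C instead of N x N. The NumPy sketch below illustrates that general idea only; the function name, scaling, and shapes are assumptions, not the authors' exact block.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_self_attention(feats):
    """Self-attention across the channel axis.

    feats: (C, N) array -- C channels, N = H*W flattened spatial positions.
    The attention matrix is (C, C), so cost scales with the number of
    channels, not with spatial resolution.
    """
    C, N = feats.shape
    scores = feats @ feats.T / np.sqrt(N)   # (C, C) channel affinities
    attn = softmax(scores, axis=-1)         # each row sums to 1
    return attn @ feats, attn               # re-weighted features, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))            # 8 channels, 16 positions
y, attn = channel_self_attention(x)
```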
Quadruped Robot Platform for Selective Pesticide Spraying
Hansen Hendra, Yubin Liu, Ryoichi Ishikawa, Takeshi Oishi, Yoshihiro Sato
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215812
Abstract: Effective control of disease and pest infection is vital for maximizing crop yields, and pesticide spraying is a commonly used method for achieving this goal. This study proposes a novel approach to selective pesticide spraying using a quadruped robot platform, which we tested in a broccoli field. We developed an algorithm to detect and track worms based on our proposed Histogram of Oriented Gradients and Support Vector Machine (HOG-SVM) technique, integrated with recent object detection and tracking methods. The platform was tested by traversing the furrows between broccoli crop lines while continuously scanning for cabbage worms. Our experiments demonstrate that the proposed HOG-SVM algorithm reduced the false positive rate of real-time worm detection by around 90% in imitation environments and around 60% in the actual field.
Citations: 0
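A HOG descriptor of the kind named above reduces an image patch to a magnitude-weighted histogram of gradient orientations, on which an SVM can then be trained. Below is a minimal single-cell sketch in NumPy; it is illustrative only, since the paper's actual HOG cell/block layout, bin count, and SVM setup are not given here.

```python
import numpy as np

def orientation_histogram(patch, bins=9):
    """Tiny HOG-style descriptor for one grayscale patch: a histogram of
    unsigned gradient orientations (0-180 degrees), weighted by gradient
    magnitude, then L2-normalized."""
    gy, gx = np.gradient(patch.astype(float))     # row- and column-gradients
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, 180.0), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-9)

# A horizontal intensity ramp: gradients point along x, orientation ~0 deg.
patch = np.tile(np.arange(8.0), (8, 1))
desc = orientation_histogram(patch)
```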
Dynamic Transfer for Domain Adaptation in Crowd Counting
Shekhor Chanda, Yang Wang
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10216197
Abstract: We consider the problem of domain adaptation in crowd counting. Given a pre-trained model learned on a source domain, our goal is to adapt this model to a target domain using unlabeled data. A solution to this problem has many potential applications in computer vision research that require a neural network model adapted to a target dataset. In this paper, we present a dynamic domain adaptation technique: we apply dynamic transfer to solve domain adaptation problems in crowd counting. The key insight is that adapting the model to the target domain is achieved by adapting the model across data samples. Experimental results on several benchmark datasets demonstrate the effectiveness of our approach.
Citations: 0
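"Adapting the model across data samples" generally means the effective weights are conditioned on each input rather than fixed per domain. A toy NumPy sketch of one sample-conditioned layer, where the weight matrix is a gated mixture of expert matrices, is shown below; the expert mixing, gate, and shapes are all illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_layer(x, experts, gate_w):
    """Sample-conditioned linear layer: the effective weight matrix is a
    convex combination of K expert matrices, with mixing coefficients
    predicted from the input itself."""
    coeffs = softmax(gate_w @ x)               # (K,) per-sample gates
    W = np.tensordot(coeffs, experts, axes=1)  # (out, in) blended weights
    return W @ x

rng = np.random.default_rng(1)
experts = rng.standard_normal((4, 3, 5))       # K=4 experts, each 3x5
gate_w = rng.standard_normal((4, 5))
x1 = rng.standard_normal(5)
x2 = rng.standard_normal(5)
y1 = dynamic_layer(x1, experts, gate_w)        # different samples see
y2 = dynamic_layer(x2, experts, gate_w)        # different effective weights
```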
Human Pose Prediction by Progressive Generation in Multi-scale Frequency Domain
Tomohiro Fujita, Yasutomo Kawanishi
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215966
Abstract: We address the problem of 3D human pose prediction from a sequence of human body skeletons. To model the spatio-temporal dynamics, the discrete cosine transform (DCT) and graph convolutional networks (GCNs) are often applied to signals on a human skeleton graph. With the DCT, the temporal information of a skeleton sequence can be embedded into the frequency domain. In previous studies, however, prediction models using the DCT learned each frequency coefficient only implicitly, through gradients computed from a loss between the predicted and ground-truth skeletons. In this paper, we propose a progressive human pose prediction model in the frequency domain that explicitly predicts the high-, medium-, and low-frequency motion of a target person. Experiments on the public Human3.6M and CMU Mocap datasets confirm that the proposed method improves prediction accuracy.
Citations: 0
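The DCT step mentioned above embeds a joint's trajectory into the frequency domain, where low-frequency coefficients carry the smooth overall motion and high-frequency ones carry fine detail. A minimal orthonormal DCT-II sketch in NumPy follows; it is illustrative only (the paper's model predicts coefficients progressively per frequency band rather than simply truncating them).

```python
import numpy as np

def dct_basis(T):
    """Orthonormal DCT-II basis matrix: rows are frequencies, columns are
    frames. Satisfies C @ C.T == I, so C.T inverts the transform."""
    k = np.arange(T)[:, None]
    n = np.arange(T)[None, :]
    C = np.cos(np.pi * (n + 0.5) * k / T) * np.sqrt(2.0 / T)
    C[0] /= np.sqrt(2.0)
    return C

T = 16
C = dct_basis(T)
traj = np.sin(np.linspace(0.0, np.pi, T))  # one joint coordinate over T frames
coefs = C @ traj                            # time domain -> frequency domain
low = coefs.copy()
low[4:] = 0.0                               # keep only low-frequency motion
smooth = C.T @ low                          # back to the time domain
```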
Towards Achieving Lightweight Deep Neural Network for Precision Agriculture with Maize Disease Detection
C. Padeiro, Takahiro Komamizu, I. Ide
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215815
Abstract: Agriculture is the pillar industry of human survival, yet crop diseases reduce the human food supply and, in the worst cases, lead to starvation and death. Experts diagnose crop disease by visually observing symptoms, a process that is time-consuming, expensive, and carries a significant risk of human error due to subjective perception. Convolutional Neural Networks (CNNs) show great potential for plant disease detection through image processing. However, they require thousands of channels to learn rich features, resulting in large models that demand powerful computation, power supply, and high bandwidth, making them expensive and difficult for farmers to acquire. Deploying these solutions on resource-constrained devices would make them more accessible. We therefore propose a lightweight object detection CNN that can run on resource-constrained devices to detect crop diseases. Channel pruning is applied to optimize resource use: removing unimportant channels and filter weights reduces network parameters, inference time, and the number of FLOPs. Experimental results with a Faster R-CNN detector using two backbones, ResNet-50 and EfficientNet-B7, show significant improvement in model efficiency while keeping high accuracy.
Citations: 0
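Channel pruning as described, removing unimportant channels and filter weights, is commonly done by ranking convolution filters with a saliency criterion and dropping the weakest. A minimal NumPy sketch using the L1-norm criterion follows; the criterion and keep ratio are illustrative assumptions, since the paper's exact pruning procedure is not reproduced here.

```python
import numpy as np

def prune_channels(weights, keep_ratio=0.5):
    """Rank conv filters by L1 norm and keep the strongest fraction.

    weights: (out_channels, in_channels, k, k). Returns the pruned tensor
    and the (sorted) indices of the surviving output channels.
    """
    scores = np.abs(weights).sum(axis=(1, 2, 3))       # L1 norm per filter
    n_keep = max(1, int(len(scores) * keep_ratio))
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])  # strongest, original order
    return weights[keep], keep

rng = np.random.default_rng(2)
w = rng.standard_normal((16, 8, 3, 3))        # 16 filters of shape 8x3x3
pruned, kept = prune_channels(w, keep_ratio=0.25)
```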
MVA 2023 Cover Page
Pub Date: 2023-07-23 | DOI: 10.23919/mva57639.2023.10216272
Citations: 0
Diabetic Retinopathy Grading based on a Sparse Network Fusion of Heterogeneous ConvNeXt Models with Category Attention
Agustin Castillo-Munguia, Gibran Benitez-Garcia, J. Olivares-Mercado, Hiroki Takahashi
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10216129
Abstract: Diabetic retinopathy (DR) is an eye disease caused by high blood sugar levels that may damage vessels in the retina, leading to partial or complete loss of vision in later stages. In recent years, convolutional neural networks (CNNs) have been used to help diagnose DR severity. However, due to the slight differences between classes and the imbalanced nature of the datasets, standard CNNs often struggle to distinguish accurately between different grades of DR. To overcome these challenges, we propose combining a novel CNN model (ConvNeXt) with category-attention blocks incorporated at multiple levels of the architecture. This generates different models that can effectively extract fine-grained features and minimize the impact of dataset imbalance. Finally, we introduce a Sparse Network Fusion technique that learns to combine the outputs of all models to consolidate their individual decisions. Extensive experiments on the challenging DDR dataset show that our proposal achieves a new state-of-the-art performance, improving grading accuracy by about 3% compared with existing methods.
Citations: 0
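The fusion stage above combines the outputs of several models into one decision. A toy sketch of sparse late fusion over per-model class logits is given below, where small weights are zeroed so weak models drop out entirely; the threshold, normalization, and weight values are illustrative assumptions, not the learned procedure from the paper.

```python
import numpy as np

def sparse_fuse(logits_list, weights, thresh=0.1):
    """Weighted late fusion: weights below `thresh` are zeroed (sparsity),
    the survivors are renormalized, and per-model logits are blended."""
    w = np.where(np.abs(weights) < thresh, 0.0, np.asarray(weights, float))
    w = w / (np.abs(w).sum() + 1e-9)
    fused = sum(wi * np.asarray(l, float) for wi, l in zip(w, logits_list))
    return fused, w

logits = [np.array([0.2, 0.8, 0.0]),   # model A (weight too small: dropped)
          np.array([0.1, 0.7, 0.2]),   # model B
          np.array([0.9, 0.0, 0.1])]   # model C
fused, w = sparse_fuse(logits, weights=np.array([0.05, 0.60, 0.35]))
```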
Object Detection for Embedded Systems Using Tiny Spiking Neural Networks: Filtering Noise Through Visual Attention
Hugo Bulzomi, Amélie Gruel, Jean Martinet, Takeshi Fujita, Yuta Nakano, R. Bendahan
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215590
Abstract: Object detection is an important task becoming increasingly common in numerous applications for embedded systems. Traditional state-of-the-art deep neural networks (DNNs) tend to be incompatible with the limitations of many such systems: their large size and high computational cost make them hard to deploy on hardware with limited resources. Spiking Neural Networks (SNNs) have attracted attention in recent years for their potential as energy-efficient alternatives when implemented on specialized hardware and for their smooth integration with energy-efficient event cameras. In this paper, we present a lightweight SNN architecture for efficient object detection in embedded systems using event camera data. We show that by applying visual attention mechanisms, we can ignore most of the noise in the input and thus reduce the number of neurons and activations, since additional noise-filtering layers are not needed. Our proposed SNN is 24 times smaller than a previous similar method at our input resolution and maintains similar overall detection performance while being more robust to noise. Finally, we demonstrate the energy efficiency of our network at runtime with an implementation on a SpiNNaker chip, showing the applicability of our approach.
Citations: 0
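The neurons in an SNN like the one above are commonly modeled as leaky integrate-and-fire (LIF) units: the membrane potential leaks over time, integrates input current, and emits a binary spike (then resets) on crossing a threshold. A minimal sketch follows; the leak factor and threshold are illustrative, as the paper's exact neuron model and parameters are not specified here.

```python
def lif_run(inputs, tau=0.8, v_th=1.0):
    """Simulate one leaky integrate-and-fire neuron over discrete steps.

    tau:  leak factor applied to the membrane potential each step
    v_th: firing threshold; the potential resets to 0 after a spike
    Returns the binary spike train.
    """
    v, spikes = 0.0, []
    for current in inputs:
        v = tau * v + current          # leak, then integrate input
        if v >= v_th:                  # threshold crossing -> spike
            spikes.append(1)
            v = 0.0                    # reset after firing
        else:
            spikes.append(0)
    return spikes

spikes = lif_run([0.3, 0.3, 0.6, 0.0, 1.2, 0.1])  # -> [0, 0, 1, 0, 1, 0]
```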
Combining Static Specular Flow and Highlight with Deep Features for Specular Surface Detection
Hirotaka Hachiya, Yuto Yoshimura
Pub Date: 2023-07-23 | DOI: 10.23919/mva57639.2023.10215694
Abstract: To apply robot teaching in a factory with many mirror-polished parts, the mirror-like surfaces must be detected accurately. Deep models for mirror detection have been studied by designing mirror-specific features, e.g., contextual contrast and similarity. However, mirror-polished parts such as plastic molds tend to have complex shapes and ambiguous boundaries, so existing mirror-specific deep features do not work well on them. To detect such complex mirror-like surfaces, we propose combining static specular flow and highlight, which frequently appear on specular surfaces, with deep multi-level feature pyramids, adaptively integrating multiple feature maps, including mirror-specific ones. Through experiments on our original real-world plastic mold dataset, we show the effectiveness of the proposed method.
Citations: 0
Shape Preservation in Image Style Transfer for Gaze Estimation
Daiki Mushiake, Kentaro Otomo, Chihiro Nakatani, N. Ukita
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10216216
Abstract: This paper proposes image style transfer with shape preservation for gaze estimation. While several shape preservation constraints have been proposed, we present additional constraints using (i) dense pixelwise correspondences between the original image and its transferred version and (ii) task-driven learning that uses the gaze estimation error to directly improve gaze direction estimation. Experiments with other SOTA methods, publicly available datasets, and ablation studies validate the effectiveness of our method.
Citations: 0