2022 International Conference on Machine Vision and Image Processing (MVIP) — Latest Articles

Tumor Detection in Brain MRI using Residual Convolutional Neural Networks
Pub Date: 2022-02-23 | DOI: 10.1109/MVIP53647.2022.9738767
Mohammad Reza Obeidavi, K. Maghooli
Abstract: Brain tumors carry a high mortality rate, and early detection can help in treating this type of cancer. Magnetic resonance imaging (MRI) is a common detection method, but there is an ongoing effort to detect tumors in medical images automatically. This paper therefore introduces a method for automatic tumor detection in MRI images using residual neural networks. Tests of the proposed network on the BraTS data set demonstrate the efficiency of the method.
Citations: 3
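The key building block behind residual networks like the one above is the skip connection, which adds a layer's input back to its output. A minimal numpy sketch (fully connected rather than convolutional, purely illustrative; `w1`/`w2` are arbitrary weight matrices, not the paper's architecture):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy fully-connected residual block: out = relu(x + f(x)).

    The skip connection lets gradients bypass f, which is what makes
    very deep residual networks trainable in practice.
    """
    h = relu(x @ w1)           # inner transform + nonlinearity
    return relu(x + h @ w2)    # add the input back before the final ReLU
```

With zero weights the block reduces to `relu(x)`, i.e. the identity for non-negative inputs, which is why stacking many such blocks does not degrade the signal.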
Deep Autoencoder Multi-Exposure HDR Imaging
Pub Date: 2022-02-23 | DOI: 10.1109/MVIP53647.2022.9738552
A. Omrani, M. Soheili, M. Kelarestaghi
Abstract: Because cameras capture images with a limited dynamic range, High Dynamic Range (HDR) imaging has attracted attention: HDR pictures present more detail and better luminance than Low Dynamic Range (LDR) images. An HDR image produced from a single LDR image cannot reconstruct details adequately, so this research proposes a deep learning method that generates an HDR picture from multiple LDR pictures with different exposures. Experiments show that the proposed algorithm outperforms other methods in both quantitative and visual comparisons.
Citations: 0
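For intuition on what multi-exposure fusion does before any learned model enters the picture, here is the classical weighted-average radiance estimate: mid-range pixels are trusted, under- and over-exposed pixels are down-weighted. This is a textbook baseline, not the paper's autoencoder:

```python
import numpy as np

def hat_weight(z):
    # triangle weighting: trust mid-range pixels, distrust clipped ones
    return 1.0 - np.abs(2.0 * z - 1.0)

def fuse_exposures(ldr_images, exposure_times, eps=1e-8):
    """Estimate scene radiance from LDR images with values in [0, 1],
    each taken at a different exposure time."""
    num = sum(hat_weight(z) * (z / t) for z, t in zip(ldr_images, exposure_times))
    den = sum(hat_weight(z) for z in ldr_images)
    return num / (den + eps)
```

A mid-gray pixel (0.5) shot at exposure time 2 yields a radiance estimate of 0.25; combining several exposures fills in the regions where any single shot is clipped.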
Light Face: A Light Face Detector for Edge Devices
Pub Date: 2022-02-23 | DOI: 10.1109/MVIP53647.2022.9738740
Saeed Khanehgir, Amir Mohammad Ghoreyshi, Alireza Akbari, R. Derakhshan, M. Sabokrou
Abstract: Face detection is one of the most important and basic steps in recognizing and verifying human identity. Deploying convolutional models such as face detectors on edge devices and mobile phones with limited memory and low computing power is difficult because of their large number of parameters, computational complexity, and high power consumption. This paper proposes a light, fast face detection model that predicts face boxes in real time with high accuracy. The model is built on the YOLO algorithm with a CSPDarknet53-tiny backbone. Tricks such as computing custom anchor boxes address the problem of varying face scales, while optimization techniques such as pruning and quantization reduce the number of parameters and improve speed, making the final model suitable for environments with low computational power. One of our best models achieves an mAP of 67.52% on the WIDER FACE dataset with a size of 1.7 MB and a speed of 1.43 FPS on a mobile phone with ordinary hardware.
Citations: 1
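The "custom anchor boxes" trick mentioned above is usually done by clustering the ground-truth box shapes of the target dataset. A sketch with plain k-means over (width, height) pairs (YOLO itself clusters with a 1 − IoU distance; Euclidean is used here for brevity, and the deterministic initialization is our own choice):

```python
import numpy as np

def kmeans_anchors(wh, k, iters=50):
    """Cluster (width, height) pairs of ground-truth boxes into k anchor shapes."""
    wh = np.asarray(wh, dtype=float)
    # deterministic init: spread the seed anchors across the box-area range
    order = np.argsort(wh.prod(axis=1))
    centers = wh[order[np.linspace(0, len(wh) - 1, k).astype(int)]].copy()
    for _ in range(iters):
        # assign each box to its nearest center (YOLO uses 1 - IoU instead)
        d = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = wh[labels == j].mean(axis=0)
    return centers[np.argsort(centers.prod(axis=1))]  # smallest anchor first
```

Anchors matched to the dataset's actual face sizes mean the detector spends its capacity refining plausible boxes rather than correcting badly shaped defaults.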
A face detection method via ensemble of four versions of YOLOs
Pub Date: 2022-02-23 | DOI: 10.1109/MVIP53647.2022.9738779
Sanaz Khalili, A. Shakiba
Abstract: We implemented a real-time ensemble model for face detection by combining the results of YOLOv1 through YOLOv4, trained on the WIDER FACE benchmark in the Darknet framework. The results are ensembled by two methods: weighted boxes fusion (WBF) and non-maximum weighted (NMW). Experimental analysis showed that the WBF ensemble increases mAP on the easy, medium, and hard subsets by 7.81%, 22.91%, and 12.96%, respectively; the corresponding gains for the NMW ensemble are 6.25%, 20.83%, and 11.11%.
Citations: 5
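The core idea of WBF, unlike NMS, is to average overlapping detections rather than discard them. A simplified sketch (the published WBF algorithm also rescales scores by the number of contributing models; that detail is omitted here):

```python
import numpy as np

def iou(a, b):
    # boxes as [x1, y1, x2, y2]
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def weighted_boxes_fusion(boxes, scores, iou_thr=0.55):
    """Simplified WBF: overlapping boxes are fused into one box whose
    coordinates are the confidence-weighted average of the cluster."""
    order = np.argsort(scores)[::-1]
    clusters = []  # each cluster: ([boxes], [scores])
    for i in order:
        for c in clusters:
            fused = np.average(c[0], axis=0, weights=c[1])
            if iou(fused, boxes[i]) > iou_thr:
                c[0].append(boxes[i])
                c[1].append(scores[i])
                break
        else:
            clusters.append(([boxes[i]], [scores[i]]))
    fused_boxes = [np.average(c[0], axis=0, weights=c[1]) for c in clusters]
    fused_scores = [np.mean(c[1]) for c in clusters]
    return np.array(fused_boxes), np.array(fused_scores)
```

Because every detector's box contributes to the fused coordinates, WBF can recover a better localization than any single model produced, which is consistent with the mAP gains the paper reports.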
Real-Time Facial Expression Recognition using Facial Landmarks and Neural Networks
Pub Date: 2022-01-31 | DOI: 10.1109/MVIP53647.2022.9738754
M. Haghpanah, Ehsan Saeedizade, M. T. Masouleh, A. Kalhor
Abstract: This paper presents a lightweight algorithm for feature extraction and classification of seven emotions, enabling real-time facial expression recognition from static images of the human face. A Multi-Layer Perceptron (MLP) neural network is trained on the extracted features. First, pre-processing localizes and crops faces from the input image. Next, a facial landmark detection library detects the landmarks of each face. The face is then split into upper and lower halves, enabling the extraction of the desired features from each part; both geometric and texture-based feature types are taken into account. After the feature extraction phase, a normalized feature vector is created, and a 3-layer MLP trained on these vectors reaches 96% accuracy on the test set.
Citations: 4
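A common way to turn landmarks into the geometric features the abstract mentions is to take pairwise distances and normalize them by a reference length so the vector is scale-invariant. A sketch (the convention that the first two landmarks are the eye centers is our illustrative assumption, not the paper's):

```python
import numpy as np

def geometric_features(landmarks):
    """Pairwise distances between facial landmarks, normalized by the
    inter-ocular distance so the feature vector ignores face scale."""
    lm = np.asarray(landmarks, dtype=float)
    # assume landmarks[0] and landmarks[1] are the eye centers (illustrative)
    scale = np.linalg.norm(lm[0] - lm[1]) + 1e-9
    i, j = np.triu_indices(len(lm), k=1)  # all unordered landmark pairs
    return np.linalg.norm(lm[i] - lm[j], axis=1) / scale
```

For n landmarks this yields n(n−1)/2 features, and scaling the whole face leaves the vector unchanged, which is what lets an MLP trained on it generalize across face sizes.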
Deep Curriculum Learning for PolSAR Image Classification
Pub Date: 2021-12-26 | DOI: 10.1109/MVIP53647.2022.9738781
Hamid Mousavi, M. Imani, H. Ghassemian
Abstract: Following the great success of curriculum learning in machine learning, a novel deep curriculum learning method, DCL, is proposed in this paper for the classification of fully polarimetric synthetic aperture radar (PolSAR) data. The method uses entropy-alpha target decomposition to estimate the degree of complexity of each PolSAR image patch before it is fed to the convolutional neural network (CNN), and an accumulative mini-batch pacing function to introduce more difficult patches to the CNN gradually. Experiments on the widely used AIRSAR Flevoland data set reveal that the proposed curriculum learning method not only increases classification accuracy but also leads to faster training convergence.
Citations: 4
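An accumulative pacing function of the kind described can be sketched as follows: samples are sorted by an externally supplied difficulty score (here that would come from the entropy-alpha decomposition), and each training stage samples mini-batches from a growing easy-first prefix of the data. The stage schedule below is a generic sketch, not the paper's exact function:

```python
import numpy as np

def accumulative_pacing(difficulties, num_stages, batch_size, seed=0):
    """Yield one mini-batch of sample indices per stage; stage s draws only
    from the easiest (s + 1) / num_stages fraction of the training set."""
    rng = np.random.default_rng(seed)
    order = np.argsort(difficulties)  # easy -> hard
    n = len(order)
    for s in range(num_stages):
        available = order[: max(batch_size, (s + 1) * n // num_stages)]
        yield available[rng.choice(len(available), batch_size, replace=True)]
```

Early stages never see the hardest patches, which is the mechanism curriculum learning credits for both the accuracy gain and the faster convergence reported in the abstract.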
Towards Fine-grained Image Classification with Generative Adversarial Networks and Facial Landmark Detection
Pub Date: 2021-08-28 | DOI: 10.1109/MVIP53647.2022.9738759
Mahdieh Darvish, Mahsa Pouramini, H. Bahador
Abstract: Fine-grained classification remains challenging because distinguishing categories requires learning complex, local differences; diversity in the pose, scale, and position of objects makes the problem even harder. Although recent Vision Transformer models achieve high performance, they need a large volume of input data. To address this, we make use of GAN-based data augmentation to generate extra dataset instances. Our dataset of choice is Oxford-IIIT Pets: 37 breeds of cats and dogs with variations in scale, pose, and lighting that intensify the difficulty of the classification task. Furthermore, we enhance the recent StyleGAN2-ADA model to generate more realistic images while preventing overfitting to the training set, by training a customized version of MobileNetV2 to predict animal facial landmarks and cropping images accordingly. Lastly, we combine the synthetic images with the original dataset and compare our proposed method with standard GAN augmentation and no augmentation on different subsets of the training data, validating the work by the accuracy of fine-grained classification with a recent Vision Transformer (ViT) model.
Code is available at: https://github.com/mahdi-darvish/GAN-augmented-pet-classifler
Citations: 3
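The landmark-guided cropping step can be sketched as taking the bounding box of the predicted landmarks, expanding it by a relative margin, and clipping to the image borders. The margin value is an illustrative assumption, not taken from the paper:

```python
import numpy as np

def crop_around_landmarks(image, landmarks, margin=0.2):
    """Crop an image to the landmark bounding box, expanded by a relative
    margin on each side and clipped to the image borders."""
    lm = np.asarray(landmarks, dtype=float)
    (x0, y0), (x1, y1) = lm.min(axis=0), lm.max(axis=0)
    mx, my = margin * (x1 - x0), margin * (y1 - y0)
    h, w = image.shape[:2]
    c0 = int(max(0, np.floor(x0 - mx)))
    c1 = int(min(w, np.ceil(x1 + mx)))
    r0 = int(max(0, np.floor(y0 - my)))
    r1 = int(min(h, np.ceil(y1 + my)))
    return image[r0:r1, c0:c1]
```

Cropping to the face region before GAN training removes background variation, which helps the generator spend capacity on the breed-discriminative details rather than on scenery.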
Exploring the Properties and Evolution of Neural Network Eigenspaces during Training
Pub Date: 2021-06-17 | DOI: 10.1109/MVIP53647.2022.9738741
Mats L. Richter, Leila Malihi, Anne-Kathrin Patricia Windler, U. Krumnack
Abstract: We investigate properties and the evolution of the emergent inference process inside neural networks using layer saturation [1] and logistic regression probes [2]. We demonstrate that the difficulty of a problem (defined by the number of classes and the complexity of the visual domain) and the number of parameters in neural network layers affect predictive performance in an antagonistic manner, and that this relationship can be measured using saturation. This opens the possibility of detecting over- and under-parameterization of neural networks. We further show that the observed effects are independent of previously reported pathological patterns such as the "tail pattern" described in [1]. Finally, we study saturation patterns during training and show that they emerge early, which allows for early analysis and potentially shorter cycle times during experiments.
Citations: 2
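Layer saturation, roughly, measures how much of a layer's feature space is actually used: the fraction of eigendirections of the feature covariance needed to explain most of the variance. A sketch under that reading (the 99% threshold is a common convention, not necessarily the one in [1]):

```python
import numpy as np

def layer_saturation(features, var_threshold=0.99):
    """Fraction of eigendirections of the feature covariance matrix needed
    to explain `var_threshold` of the variance. Values far below 1 suggest
    the layer is wider than the signal it carries (over-parameterization)."""
    cov = np.cov(np.asarray(features, dtype=float), rowvar=False)
    eig = np.sort(np.linalg.eigvalsh(cov))[::-1]   # descending eigenvalues
    cum = np.cumsum(eig) / eig.sum()
    k = int(np.searchsorted(cum, var_threshold)) + 1
    return k / len(eig)
```

For example, 4-dimensional features that vary along a single direction give a saturation of 1/4: three quarters of the layer's width contributes nothing to the representation.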
Lip reading using external viseme decoding
Pub Date: 2021-04-10 | DOI: 10.1109/MVIP53647.2022.9738749
J. Peymanfard, M. R. Mohammadi, Hossein Zeinali, N. Mozayani
Abstract: Lip reading is the task of recognizing speech from lip movements. It is difficult because the lip movements for some words are similar when pronounced; a viseme describes the lip movements made during a conversation. This paper shows how to use external text data (for viseme-to-character mapping) by dividing video-to-character conversion into two stages, converting video to visemes and then visemes to characters, using separate models. Our proposed method improves word error rate by an absolute 4% over a typical sequence-to-sequence lip-reading model on the BBC-Oxford Lip Reading dataset (LRS2).
Citations: 6
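The ambiguity the abstract describes, and why a second decoding stage backed by external text data helps, can be seen in a toy example: several characters share one mouth shape, so a viseme sequence matches multiple words and a lexicon or language model must disambiguate. The viseme classes and mapping below are entirely illustrative:

```python
# Toy viseme inventory: characters that share a mouth shape map to one viseme.
VISEME_OF = {c: v for v, chars in {
    "bilabial": "bmp", "labiodental": "fv", "rounded": "ow", "open": "a",
}.items() for c in chars}

def word_to_visemes(word):
    return [VISEME_OF.get(c, "other") for c in word]

def decode(viseme_seq, lexicon):
    # stage 2: among candidate words, keep those matching the viseme sequence;
    # a real system would rank them with a language model trained on text
    return [w for w in lexicon if word_to_visemes(w) == viseme_seq]
```

Here "bat", "mat", and "pat" all produce the same viseme sequence, which is exactly the collision a video-only model cannot resolve and an external text model can.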