Selection of Hyperparameters and Data Augmentation Method for Diverse Backbone Models Mask R-CNN

A. D. Egorov, M. S. Reznik
{"title":"Selection of Hyperparameters and Data Augmentation Method for Diverse Backbone Models Mask R-CNN","authors":"A. D. Egorov, M. S. Reznik","doi":"10.1109/CTS53513.2021.9562845","DOIUrl":null,"url":null,"abstract":"Among the most difficult computer vision tasks is one of detecting object's action. Solving that problem, it is needed to be aware of the position of the key points of a particular type of an object. Information about key points position uses to management decision making in technical systems. It is also being complicated task with the fact that training models able to detect the key points require a significant amount of complexly organized data. This paper focuses on finding a solution to the problem of detecting the position of biological object key points. That information is useful in terms of object's actions classification as well as for tracking them. Due to the lack of data for training, a method for obtaining additional data for training is suggested (data augmentation), also various types of backbone models are tested within the R-CNN networks on differently augmented data, with different optimizers, learning rate, number of training epochs and batches. Achieved accuracy on the test sample is more than 90%. The use of backbone models of the ResNet family allowed to achieve greater accuracy of work, which was more than 93%, while the use of reference models from the MobileNet family with an accuracy of about 90% allowed to achieve a processing speed of each frame three times higher (on average) than while using backbone models of the ResNet family.","PeriodicalId":371882,"journal":{"name":"2021 IV International Conference on Control in Technical Systems (CTS)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IV International Conference on Control in Technical Systems (CTS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CTS53513.2021.9562845","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

Detecting an object's actions is among the most difficult computer vision tasks. Solving it requires knowing the positions of the key points of a particular type of object; information about key point positions is used for management decision making in technical systems. The task is further complicated by the fact that training models capable of detecting key points requires a significant amount of complexly organized data. This paper focuses on solving the problem of detecting the positions of the key points of biological objects; that information is useful both for classifying an object's actions and for tracking it. Because training data is scarce, a method for obtaining additional training data (data augmentation) is proposed, and various types of backbone models are tested within R-CNN networks on differently augmented data, with different optimizers, learning rates, numbers of training epochs, and batch sizes. The accuracy achieved on the test sample exceeds 90%. Backbone models of the ResNet family achieved higher accuracy, more than 93%, while backbone models of the MobileNet family, with an accuracy of about 90%, processed each frame on average about three times faster than the ResNet-family backbones.
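
The paper itself provides no code; the following is a minimal sketch, assuming PyTorch/torchvision (>= 0.13), of how a backbone-and-hyperparameter comparison of this kind could be set up with torchvision's Keypoint R-CNN (a Mask R-CNN-style detector with a keypoint head). The keypoint count, class count, optimizers and learning rates are illustrative placeholders, not values taken from the paper.

# Minimal sketch, not the authors' implementation: a Keypoint R-CNN with a
# ResNet-50-FPN backbone, swept over a small grid of optimizers and learning
# rates. All specific values below are assumptions for illustration.
import torch
from torchvision.models.detection import keypointrcnn_resnet50_fpn


def build_model(num_classes: int = 2, num_keypoints: int = 17):
    # torchvision >= 0.13 keyword API; weights=None / weights_backbone=None
    # means training from scratch (pretrained weights could be passed instead).
    return keypointrcnn_resnet50_fpn(
        weights=None,
        weights_backbone=None,
        num_classes=num_classes,
        num_keypoints=num_keypoints,
    )


def make_optimizer(name: str, params, lr: float):
    # Two optimizers of the kind one would compare in such a sweep.
    if name == "sgd":
        return torch.optim.SGD(params, lr=lr, momentum=0.9, weight_decay=1e-4)
    if name == "adam":
        return torch.optim.Adam(params, lr=lr)
    raise ValueError(f"unknown optimizer: {name}")


# Hyperparameter grid of the kind the paper varies (placeholder values).
for opt_name in ("sgd", "adam"):
    for lr in (1e-3, 1e-4):
        model = build_model()
        optimizer = make_optimizer(opt_name, model.parameters(), lr)
        # A real run would iterate over augmented training batches here,
        # summing the R-CNN losses returned by the model in training mode
        # and stepping the optimizer, then evaluating keypoint accuracy.

A MobileNet-family backbone could be substituted in the same framework by assembling a KeypointRCNN from a MobileNet feature extractor, trading some accuracy for per-frame speed, as the abstract reports.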