Generation and evaluation of synthetic models for training people detectors

Rafael Martin Nieto, Jesus Molina Merchan, Álvaro García-Martín, J. Sanchez
DOI: 10.1109/CCST.2017.8167818
Published in: 2017 International Carnahan Conference on Security Technology (ICCST), 2017-10-01
URL: https://doi.org/10.1109/CCST.2017.8167818
Citations: 0

Abstract

There is a large demand in the area of video surveillance, especially for people detection, which has led to a large increase in the volume of research and resources devoted to this field. As training images and annotations are not always available, it is important to consider the cost involved in creating the detector models. For example, to detect elderly people, the detector must take into account different positions such as standing, sitting, or sitting in a wheelchair. The main objective of this work is therefore to reduce the resources needed to generate the detection model, saving the cost of recording new sequences and generating the associated annotations for detector training. To achieve this, three synthetic image datasets have been created in order to train three different models. The models are evaluated to determine which is optimal, and the feasibility of the best one is analyzed by comparing it with a people detector for wheelchair users trained with real images. Other people detection scenarios in which this technique could be applied include people riding horses or motorbikes, or people pushing supermarket carts. The synthetic datasets have been generated by combining images of standing people with wheelchair images, combining image patches, and segmenting sections of people (trunk, legs, etc.) to add them to the wheelchair image. As expected, the results obtained show a reduction in efficiency (between 21% and 25%) in exchange for an enormous saving in human annotation effort and in the resources needed to record real images.
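The patch-combination idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `composite_patch`, the boolean-mask representation of a segmented body part, and the tiny arrays standing in for images are all assumptions made for illustration. The sketch pastes a masked patch (for example, a trunk segment) onto a background image (for example, a wheelchair image) and returns the bounding box of the pasted pixels, which is the kind of annotation that synthetic generation yields without manual labeling.

```python
import numpy as np


def composite_patch(background, patch, mask, top, left):
    """Paste `patch` onto a copy of `background` wherever `mask` is True.

    Hypothetical helper for illustration only. Returns the composited image
    and the bounding box (x_min, y_min, x_max, y_max) of the pasted pixels,
    in background coordinates, with inclusive bounds.
    """
    h, w = mask.shape
    if top < 0 or left < 0:
        raise ValueError("offsets must be non-negative")
    if top + h > background.shape[0] or left + w > background.shape[1]:
        raise ValueError("patch does not fit inside the background")
    if not mask.any():
        raise ValueError("mask is empty")

    out = background.copy()
    region = out[top:top + h, left:left + w]  # a view into `out`
    region[mask] = patch[mask]                # copy only the masked pixels

    ys, xs = np.nonzero(mask)
    box = (left + int(xs.min()), top + int(ys.min()),
           left + int(xs.max()), top + int(ys.max()))
    return out, box


# Toy data standing in for real images (assumed shapes and values).
wheelchair = np.zeros((6, 6, 3), dtype=np.uint8)   # black "wheelchair" image
torso = np.full((3, 2, 3), 255, dtype=np.uint8)    # white "torso" patch
torso_mask = np.array([[True, True],
                       [True, True],
                       [False, True]])             # segmented torso shape

image, box = composite_patch(wheelchair, torso, torso_mask, top=1, left=2)
print(box)
```

Because the placement offsets and the mask are known at generation time, the bounding box comes for free, which is the source of the annotation savings the abstract refers to. A realistic pipeline would also need scaling, blending at patch borders, and varied backgrounds, none of which are shown here.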