Attention in Crowd Counting Using the Transformer and Density Map to Improve Counting Result

2021 8th NAFOSTED Conference on Information and Computer Science (NICS) Pub Date : 2021-12-21 DOI:10.1109/NICS54270.2021.9701500

P. Do

Citations: 6

Abstract

With the rapid development of CNNs, most crowd counting methods use a CNN to estimate a density map and then infer the count from it. However, these methods are constrained by limited receptive fields, background noise, and similar factors. With the advent of the Transformer in natural language processing, it has become possible to apply this model to the crowd counting problem. The Transformer can model global context, which helps address the limited receptive field. In addition, its attention mechanism lets the model focus on regions where people are concentrated, which helps suppress background noise. In this paper, we propose TDCrowd, a crowd counting model that combines a Transformer and a density map to estimate the number of people in a crowd. With the Transformer, TDCrowd can be trained without information about the locations of people in the crowd, using only the count. Experiments on three datasets, ShanghaiTech, UCF-QNRF, and JHU-Crowd++, show that TDCrowd gives better results than both regression-based methods (which need only count information) and density map-based methods (which need both count and location information).
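The idea of inferring a count from a density map can be illustrated with a minimal sketch. This is not the TDCrowd implementation, which the abstract does not describe in detail; it only shows the general convention in which each annotated person contributes a unit-mass Gaussian, so the map sums to the count. The function names, the kernel size and sigma, and the absolute-error `count_only_loss` are all assumptions made for illustration.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def density_map(points, height, width, size=5, sigma=1.0):
    """Place one unit-mass Gaussian per in-bounds head position (row, col)."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in points:
        # Keep only kernel taps that fall inside the image, then renormalize
        # so a person near the border still contributes exactly 1.
        taps = []
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:
                    taps.append((y, x, kernel[dy][dx]))
        mass = sum(w for _, _, w in taps)
        for y, x, w in taps:
            dmap[y][x] += w / mass
    return dmap

def count_from_density(dmap):
    """The estimated count is the sum over all density map cells."""
    return sum(sum(row) for row in dmap)

def count_only_loss(pred_dmap, true_count):
    """Hypothetical count-level supervision: needs the count, not locations."""
    return abs(count_from_density(pred_dmap) - true_count)

# Three hypothetical head annotations on a 16 x 16 image.
heads = [(2, 3), (10, 10), (0, 0)]
dmap = density_map(heads, 16, 16)
estimated = count_from_density(dmap)
```

A loss of the `count_only_loss` kind compares only the summed prediction with the true count, which is the sense in which a model can be trained from count information alone; a density map-based loss would instead compare the predicted and ground-truth maps cell by cell and therefore requires the head locations.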
