Mingda Han;Huanqi Yang;Mingda Jia;Weitao Xu;Yanni Yang;Zhijian Huang;Jun Luo;Xiuzhen Cheng;Pengfei Hu
IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 14592-14606. Published 2024-08-19. DOI: 10.1109/TMC.2024.3445507. URL: https://ieeexplore.ieee.org/document/10638821/
Seeing the Invisible: Recovering Surveillance Video With COTS mmWave Radar
Video surveillance systems play a crucial role in ensuring public safety and security by capturing and monitoring critical events in various areas. However, traditional surveillance cameras are vulnerable to malicious physical damage and to deliberate obscuring by offenders. To overcome this limitation, we propose m$^{2}$Vision, the first millimeter-wave (mmWave)-based video reconstruction system designed to enhance existing video surveillance cameras. m$^{2}$Vision uses mmWave to sense the profile and motion signature of the target and integrates this with previously acquired visual data about the environment and the target's appearance, thereby enabling the reconstruction of surveillance video. Specifically, our proposed system incorporates a dual-stage mmWave signal denoising algorithm to efficiently eliminate noise, and a multiple-input multiple-output virtual antenna enhanced heatmap generation (MVAE-HG) method to obtain fine-grained mmWave heatmaps responsive to the target's profile and motion information. Moreover, we design the mm2Video generative network, which first employs a multi-modal fusion module to fuse the mmWave and pre-acquired visual data, then uses a conditional generative adversarial network (cGAN)-based video reconstruction module to reconstruct the surveillance video. We conducted comprehensive experiments on m$^{2}$Vision using a commercial mmWave radar and four surveillance cameras across various environments, with the participation of seven individuals. Evaluation results show that m$^{2}$Vision achieves an average structural similarity index measure (SSIM) of 0.93, demonstrating its effectiveness and potential.
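For readers unfamiliar with the reported metric, the following is a minimal sketch of how an average SSIM over video frames could be computed. It is not the authors' evaluation code: the abstract does not say which SSIM variant, window size, or color handling was used. The sketch assumes grayscale frames with an 8-bit value range and uses a simplified single-window (global) SSIM rather than the more common sliding-window form. The function names `global_ssim` and `mean_video_ssim` are hypothetical.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two same-shaped grayscale images.

    Simplification (assumption): statistics are taken over the whole
    image instead of over local sliding windows.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")

    # Stabilizing constants from the standard SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


def mean_video_ssim(ref_frames, rec_frames, data_range=255.0):
    """Average per-frame SSIM between reference and reconstructed frames."""
    ref_frames = list(ref_frames)
    rec_frames = list(rec_frames)
    if not ref_frames or len(ref_frames) != len(rec_frames):
        raise ValueError("need two non-empty frame lists of equal length")
    scores = [
        global_ssim(ref, rec, data_range)
        for ref, rec in zip(ref_frames, rec_frames)
    ]
    return float(np.mean(scores))
```

Identical frames score 1, and the score drops as the reconstruction departs from the reference in mean intensity, contrast, or structure. A windowed implementation will generally give different numbers from this global variant, so values from this sketch are not directly comparable to the 0.93 reported above.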
About the journal:
IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.