Research on Pedestrian Detection Based on Multimodal Information Fusion
Xiaoping Yang, Zhehong Li, Yuan Liu, Ran Huang, Kai Tan, Lin Huang
Information Technology and Control, published 2023-12-22
DOI: https://doi.org/10.5755/j01.itc.52.4.33766
Abstract
Pedestrian detection in autonomous driving systems is vulnerable to the external environment, and detectors based on a single sensor modality perform poorly when unconstrained illumination changes sharply. To address this, this paper proposes a pedestrian detection method that fuses the visible-light and thermal infrared modalities. First, 1 × 1 convolution and dilated (expanded) convolution are introduced into the residual network, and RoI Align replaces RoI Pooling for mapping candidate boxes to the feature layer, in order to optimize Faster R-CNN. Second, the generalized intersection over union (GIoU) loss is used as the loss function for the localization regression of predicted boxes. Finally, based on the improved Faster R-CNN, four multimodal neural network structures are designed to fuse visible and thermal infrared images. Experimental results show that the proposed method outperforms current mainstream detection algorithms on the KAIST dataset. Compared with the conventional ACF + T + THOG pedestrian detector, the AP is 8.38 percentage points higher. Compared with the visible-light pedestrian detector, the miss rate is 5.34 percentage points lower.
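To make the GIoU term in the abstract concrete, the following is a minimal plain-Python sketch of the GIoU value and the corresponding loss for one pair of boxes. It assumes axis-aligned boxes given as (x1, y1, x2, y2) with positive area. The function names `giou` and `giou_loss` are illustrative and are not taken from the paper, whose implementation inside Faster R-CNN may differ.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (positive area).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to 0 when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose_area = enclose_w * enclose_h

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose_area - union) / enclose_area


def giou_loss(pred_box, target_box):
    """Regression loss in [0, 2); 0 when the boxes coincide."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike plain IoU, the value stays informative when the predicted and target boxes do not overlap: IoU is 0 for every disjoint pair, whereas GIoU becomes more negative as the boxes move apart, so the loss still distinguishes near misses from far ones.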
About the journal:
The journal covers a wide range of problems in computer science and control systems, including:
-Software and hardware engineering;
-Management systems engineering;
-Information systems and databases;
-Embedded systems;
-Physical systems modelling and application;
-Computer networks and cloud computing;
-Data visualization;
-Human-computer interface;
-Computer graphics, visual analytics, and multimedia systems.