V. Ghassab, Kamal Maanicshah, N. Bouguila, Paul Green
REP-Model: A deep learning framework for replacing ad billboards in soccer videos
2020 IEEE International Symposium on Multimedia (ISM), published 2020-12-01
DOI: 10.1109/ISM.2020.00032 (https://doi.org/10.1109/ISM.2020.00032)
Citation count: 0
Abstract
In this paper, we propose a novel framework that automatically replaces advertisement content in soccer videos using deep learning. We begin by applying UNET (a convolutional neural network for image segmentation) to segment and detect the content. After reconstructing the segmented content in the video frames to account for the apparent loss in detection, we replace the unwanted content with new content using a homography mapping procedure. The replacement key points in each frame are then tracked into the following frames, taking camera zoom-in and zoom-out into account. Because moving objects in the video can disrupt the alignment between frames and thereby make the homography matrix calculation erroneous, we use Mask R-CNN to mask and remove the moving objects from the scene. We call this framework REP-Model, which stands for replacing model.
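The abstract does not give implementation details, so the following is only a minimal sketch of the homography step under stated assumptions, not the authors' code. It assumes the four corners of the detected billboard are already known (in the paper they would come from the segmentation stage), estimates the homography with a plain direct linear transform in NumPy, and discards point matches that fall on a moving-object mask before estimation. The function names, the corner coordinates, and the mask are all hypothetical.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate a 3x3 homography from >= 4 point pairs (direct linear transform).

    Assumes the points are not degenerate (e.g. not all collinear).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear constraints per correspondence (x, y) -> (u, v).
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def apply_homography(h, pts):
    """Map 2D points through a homography using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


def drop_masked_points(pts_a, pts_b, moving_mask):
    """Keep only matches whose position in frame A is off the moving-object mask.

    moving_mask is a boolean (height, width) array, True where an object moves.
    """
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    cols = np.clip(np.round(pts_a[:, 0]).astype(int), 0, moving_mask.shape[1] - 1)
    rows = np.clip(np.round(pts_a[:, 1]).astype(int), 0, moving_mask.shape[0] - 1)
    keep = ~moving_mask[rows, cols]
    return pts_a[keep], pts_b[keep]


# Hypothetical data: corners of a 200x50 replacement ad and the billboard
# quadrilateral found in the frame, both as (x, y) pixel coordinates.
ad_corners = np.array([[0, 0], [200, 0], [200, 50], [0, 50]], dtype=float)
billboard_quad = np.array([[320, 140], [510, 150], [505, 200], [318, 185]], dtype=float)

H = estimate_homography(ad_corners, billboard_quad)
projected = apply_homography(H, ad_corners)  # where the ad corners land in the frame
```

With exactly four correspondences the estimate passes through the given corners. With more matches, such as tracked key points between consecutive frames, the same routine returns a least-squares fit, which is why filtering out matches on moving players first matters. A production pipeline would more likely use a robust estimator such as RANSAC instead of this bare fit.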