HFCER: Hybrid Fusion for Cultural Event Recognition in Images
Authors: Shivansh Srivastava, Bappaditya Mandal, Anirban Chakraborty
Venue: 11th International Conference of Pattern Recognition Systems (ICPRS 2021)
DOI: 10.1049/icp.2021.1455 (https://doi.org/10.1049/icp.2021.1455)
Understanding high-level semantic concepts in images requires combining information from several kinds of visual cues. One such task is event recognition from still images, which requires simultaneous reasoning about objects, people, scenes and their interactions. In this work, we explore strategies for fusing object and scene information to aid cultural event recognition. We start with early and late fusion strategies that combine object-level and scene-level information to reason about event classes. To support our hypothesis that early-fused models extract complementary object and scene information, we propose using guided backpropagation to visualize image activations. Inspection of these activations indicates object-scene complementarity under early fusion that is not observed in late-fused models. Extending the early and late fusion techniques, we propose HFCER, a hybrid fusion framework with an alternating training scheme. The proposed technique improves over its early and late fusion counterparts. Finally, late fusion of the late, early and hybrid fusion models achieves state-of-the-art results on the ChaLearn LAP cultural event recognition dataset.
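The difference between early and late fusion can be illustrated with a minimal sketch. The abstract does not specify the feature extractors, feature dimensions, classifiers or fusion weights, so everything below is an assumption made for illustration: the feature vectors are random numbers, the classifiers are single linear layers with random weights, and the function names (`early_fusion`, `late_fusion`) are hypothetical. The hybrid HFCER architecture and its alternating training scheme are not sketched, since the abstract gives no details about them.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights, bias):
    """One linear layer: `weights` holds one row of coefficients per class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def early_fusion(obj_feat, scene_feat, weights, bias):
    """Early fusion: concatenate the features, then classify once."""
    fused = obj_feat + scene_feat  # list concatenation
    return softmax(linear(fused, weights, bias))


def late_fusion(prob_lists, model_weights=None):
    """Late fusion: weighted average of per-model class probabilities."""
    if model_weights is None:
        model_weights = [1.0] * len(prob_lists)
    total = sum(model_weights)
    n_classes = len(prob_lists[0])
    return [sum(w * p[c] for w, p in zip(model_weights, prob_lists)) / total
            for c in range(n_classes)]


def random_matrix(rng, rows, cols):
    """Hypothetical stand-in for trained classifier weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Illustrative sizes only; none of these come from the paper.
rng = random.Random(0)
d_obj, d_scene, n_classes = 8, 6, 5
obj_feat = [rng.gauss(0.0, 1.0) for _ in range(d_obj)]
scene_feat = [rng.gauss(0.0, 1.0) for _ in range(d_scene)]
bias = [0.0] * n_classes

# Early fusion: one classifier sees both feature sets at once.
p_early = early_fusion(obj_feat, scene_feat,
                       random_matrix(rng, n_classes, d_obj + d_scene), bias)

# Late fusion: each stream is classified separately, then scores are merged.
p_obj = softmax(linear(obj_feat, random_matrix(rng, n_classes, d_obj), bias))
p_scene = softmax(linear(scene_feat, random_matrix(rng, n_classes, d_scene), bias))
p_late = late_fusion([p_obj, p_scene])

# Score-level ensemble of differently fused models, in the spirit of the
# final sentence of the abstract (a hybrid model would be a third entry).
p_ensemble = late_fusion([p_early, p_late])
```

In the early-fused model the classifier weights can couple object and scene features directly, whereas in late fusion the two streams interact only through their averaged output probabilities. This is one way to read the complementarity contrast that the abstract reports.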