{"title":"识别真实图像中的视觉合成","authors":"Lin Bai, Kan Li, Shuai Jiang","doi":"10.1109/IJCNN.2015.7280523","DOIUrl":null,"url":null,"abstract":"Automatically discovering and recognizing the main structured visual pattern of an image is a challenging problem. The most difficulties are how to find the component objects and how to recognize the interaction among these objects. The component objects of the structured visual pattern have consistent 3D spatial co-occurrence layout across images, which manifest themselves as a predictable pattern called visual composite. In this paper, we propose a visual composite recognition model to automatically discover and recognize the visual composite of an image. Our model firstly learns 3D spatial co-occurrence statistics among objects to discover the potential structured visual pattern of an image so that it captures the component objects of visual composite. Secondly, we construct a feedforward architecture using the proposed factored three-way interaction machine to recognize the visual composite, which casts the recognition problem as a structured prediction task. It predicts the visual composite by maximizing the probability of the correct structured label given the component objects and their 3D spatial context. Experiments conducted on a six-class sports dataset and a phrasal recognition dataset respectively demonstrate the encouraging performance of our model in discovery precision and recognition accuracy compared with competing approaches.","PeriodicalId":6539,"journal":{"name":"2015 International Joint Conference on Neural Networks (IJCNN)","volume":"1 1","pages":"1-8"},"PeriodicalIF":0.0000,"publicationDate":"2015-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Recognizing visual composite in real images\",\"authors\":\"Lin Bai, Kan Li, Shuai Jiang\",\"doi\":\"10.1109/IJCNN.2015.7280523\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automatically discovering and recognizing the main structured visual pattern of an image is a challenging problem. The most difficulties are how to find the component objects and how to recognize the interaction among these objects. The component objects of the structured visual pattern have consistent 3D spatial co-occurrence layout across images, which manifest themselves as a predictable pattern called visual composite. In this paper, we propose a visual composite recognition model to automatically discover and recognize the visual composite of an image. Our model firstly learns 3D spatial co-occurrence statistics among objects to discover the potential structured visual pattern of an image so that it captures the component objects of visual composite. Secondly, we construct a feedforward architecture using the proposed factored three-way interaction machine to recognize the visual composite, which casts the recognition problem as a structured prediction task. It predicts the visual composite by maximizing the probability of the correct structured label given the component objects and their 3D spatial context. Experiments conducted on a six-class sports dataset and a phrasal recognition dataset respectively demonstrate the encouraging performance of our model in discovery precision and recognition accuracy compared with competing approaches.\",\"PeriodicalId\":6539,\"journal\":{\"name\":\"2015 International Joint Conference on Neural Networks (IJCNN)\",\"volume\":\"1 1\",\"pages\":\"1-8\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-07-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 International Joint Conference on Neural Networks (IJCNN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IJCNN.2015.7280523\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN.2015.7280523","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Automatically discovering and recognizing the main structured visual pattern of an image is a challenging problem. The most difficulties are how to find the component objects and how to recognize the interaction among these objects. The component objects of the structured visual pattern have consistent 3D spatial co-occurrence layout across images, which manifest themselves as a predictable pattern called visual composite. In this paper, we propose a visual composite recognition model to automatically discover and recognize the visual composite of an image. Our model firstly learns 3D spatial co-occurrence statistics among objects to discover the potential structured visual pattern of an image so that it captures the component objects of visual composite. Secondly, we construct a feedforward architecture using the proposed factored three-way interaction machine to recognize the visual composite, which casts the recognition problem as a structured prediction task. It predicts the visual composite by maximizing the probability of the correct structured label given the component objects and their 3D spatial context. Experiments conducted on a six-class sports dataset and a phrasal recognition dataset respectively demonstrate the encouraging performance of our model in discovery precision and recognition accuracy compared with competing approaches.