A. M. Obeso, M. García-Vázquez, A. A. Ramírez-Acosta, J. Benois-Pineau
Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, published 2017-06-19. DOI: https://doi.org/10.1145/3095713.3095730
Connoisseur: classification of styles of Mexican architectural heritage with deep learning and visual attention prediction
Automatic description of multimedia content has mainly been developed for classification, retrieval, and large-scale organization of data. Preservation of cultural heritage is an important application area for these methods. We address the classification of architectural styles of buildings in digital photographs of Mexican cultural heritage. Selecting the relevant content in a scene when training classification models makes them more precise. We use a saliency-driven approach to predict visual attention in images and use the prediction to train a Convolutional Neural Network (CNN) that identifies the architectural style of Mexican buildings. We also analyze the behavior of models trained on traditionally cropped images and on saliency (prominence) maps. The saliency-based CNNs outperform traditional training, reaching a classification rate of 97% on the validation dataset. Style identification with this technique could contribute broadly to video description tasks, specifically the automatic documentation of Mexican cultural heritage.
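The abstract does not say how the predicted saliency is turned into training inputs. As a rough illustration only, assume a saliency map is already available as a 2D array aligned with the image. One possible selection step then picks the fixed-size window with the highest total saliency and uses it as the training crop. The function name most_salient_crop, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def most_salient_crop(image, saliency, crop_h, crop_w):
    """Return the crop_h x crop_w window of `image` with the highest summed saliency.

    Hypothetical helper for illustration; not the authors' published method.
    `image` has shape (H, W, C) and `saliency` has shape (H, W).
    """
    H, W = saliency.shape
    if image.shape[:2] != (H, W):
        raise ValueError("image and saliency map must share height and width")
    if not (0 < crop_h <= H and 0 < crop_w <= W):
        raise ValueError("crop size must fit inside the image")

    # Integral image with a zero border, so any window sum needs four lookups.
    integral = np.zeros((H + 1, W + 1), dtype=np.float64)
    integral[1:, 1:] = saliency.cumsum(axis=0).cumsum(axis=1)

    # sums[y, x] = total saliency of the window whose top-left corner is (y, x).
    sums = (
        integral[crop_h:, crop_w:]
        - integral[:-crop_h, crop_w:]
        - integral[crop_h:, :-crop_w]
        + integral[:-crop_h, :-crop_w]
    )
    top, left = np.unravel_index(np.argmax(sums), sums.shape)
    crop = image[top:top + crop_h, left:left + crop_w]
    return crop, (int(top), int(left))


if __name__ == "__main__":
    # Toy data: a 6x8 RGB image with one small salient blob (assumed values).
    img = np.arange(6 * 8 * 3).reshape(6, 8, 3)
    sal = np.zeros((6, 8))
    sal[2:4, 4:6] = 1.0
    crop, corner = most_salient_crop(img, sal, 2, 2)
    print("top-left corner of selected crop:", corner)
```

The resulting crops could then be fed to any standard CNN training pipeline in place of center or random crops. A real system would obtain the saliency map from a visual attention prediction model, which this sketch leaves out.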