A deep learning approach for cultural heritage building classification using transfer learning and data augmentation
André Luiz Carvalho Ottoni, Lara Toledo Cordeiro Ottoni
Journal of Cultural Heritage, Volume 74, Pages 214-224, published 2025-07-01
DOI: 10.1016/j.culher.2025.06.010
https://www.sciencedirect.com/science/article/pii/S1296207425001207
Citations: 0
Abstract
Detecting architectural components in historic buildings is essential for the digital documentation and conservation of cultural heritage. Recent studies have applied artificial intelligence and computer vision to detect key components of monuments. However, the field still lacks investigation into how transfer learning and data augmentation affect the performance of machine learning models, and little research applies artificial intelligence to Brazilian colonial architecture. This study therefore proposes a new deep learning approach for cultural heritage building classification using transfer learning and data augmentation. For this purpose, it introduces the ImageMG dataset, containing 6449 images of 94 historic buildings from the state of Minas Gerais (Brazil), categorized into five classes: fronton, church, door, window, and tower. The study evaluates how transfer learning affects the classification results of the MobileNet architecture when detecting components of historic buildings. It also investigates 64 combinations of data augmentation built from six geometric transformations (zoom, width shift range, height shift range, vertical flip, horizontal flip, and rotation), which generate synthetic images for training the deep learning models. The results show that optimizing transfer learning together with data augmentation significantly improved classification performance. On the ImageMG dataset, transfer learning combined with vertical flip achieved the best accuracy in validation (92.37%), test 1 (90.22%), and test 2 (87.33%).
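The count of 64 augmentation combinations is consistent with switching each of the six transformations on or off independently (2^6 = 64). A minimal sketch of that enumeration follows, assuming this on/off reading. The key names mirror Keras `ImageDataGenerator` arguments, which is an assumption since the abstract does not name a library, and the magnitudes are hypothetical placeholders rather than values from the paper.

```python
from itertools import product

# Transformation names follow the abstract. The magnitudes below are
# hypothetical placeholders, not the settings used in the paper.
TRANSFORMS = {
    "zoom_range": 0.2,
    "width_shift_range": 0.1,
    "height_shift_range": 0.1,
    "vertical_flip": True,
    "horizontal_flip": True,
    "rotation_range": 20,
}


def augmentation_configs(transforms):
    """Yield one dict per on/off combination of the given transformations."""
    names = list(transforms)
    # Each mask is a tuple of booleans, one per transformation.
    for mask in product([False, True], repeat=len(names)):
        yield {name: transforms[name] for name, on in zip(names, mask) if on}


configs = list(augmentation_configs(TRANSFORMS))
print(len(configs))  # 64
print(configs[0])    # {} (baseline with no augmentation)
```

Each resulting dictionary could be passed as keyword arguments to an augmentation pipeline, with one model trained per configuration. The single-entry dictionary `{"vertical_flip": True}` corresponds to the vertical-flip-only setting that the abstract reports as the best performer.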
About the journal:
The Journal of Cultural Heritage publishes original papers which comprise previously unpublished data and present innovative methods concerning all aspects of science and technology of cultural heritage as well as interpretation and theoretical issues related to preservation.