Identification of stone deterioration patterns with large multimodal models. Definitions and benchmarking
Authors: Daniele Corradetti, José Delgado Rodrigues
Journal: Journal of Cultural Heritage, volume 71, pages 175-183
Publication date: 2025-01-01
Publication type: Journal Article
DOI: 10.1016/j.culher.2024.11.017
URL: https://www.sciencedirect.com/science/article/pii/S1296207424002486
Citations: 0
Abstract
The conservation of stone-based cultural heritage sites is a critical concern for preserving cultural and historical landmarks. With the advent of Large Multimodal Models such as GPT-4omni (OpenAI), Claude 3 Opus (Anthropic), and Gemini 1.5 Pro (Google), it is becoming increasingly important to define the operational capabilities of these models. In this work, we systematically evaluate the image classification capabilities of the main foundational multimodal models in recognising and categorising anomalies and deterioration patterns of stone elements that are useful in the practice of conservation and restoration of world heritage. After defining a taxonomy of the main stone deterioration patterns and anomalies, we asked the foundational models to classify a curated selection of 354 highly representative images of stone-built heritage, offering them a careful selection of labels to choose from. The results, which vary depending on the type of pattern, allowed us to identify the strengths and weaknesses of these models in the field of heritage conservation and restoration.
About the journal:
The Journal of Cultural Heritage publishes original papers which comprise previously unpublished data and present innovative methods concerning all aspects of science and technology of cultural heritage as well as interpretation and theoretical issues related to preservation.