Human object interaction detection in paintings using multi-task learning

Q1 Social Sciences
Maya Antoun, Daniel Asmar
Journal: Digital Applications in Archaeology and Cultural Heritage, vol. 34, Article e00364
Published: 2024-07-24
Publication type: Journal Article
DOI: 10.1016/j.daach.2024.e00364
URL: https://www.sciencedirect.com/science/article/pii/S2212054824000493
Citations: 0

Abstract

Human Object Interaction (HOI) detection can provide valuable insight into the meaning and interpretation of a painting, because the interactions between humans and objects reveal information about the scene, characters, and story depicted in the artwork. Automatically detecting HOIs in paintings is challenging: paintings often contain complex scenes with intricate details and wide variation in artistic style. In addition, unlike real-world images, a painted scene need not obey physical rules in its context or physics, which further complicates detection. This paper proposes a novel system for detecting HOIs in paintings using multi-task learning. The system uses an object detection model to detect instances of human figures and objects, and extracts visual and spatial features from them. The appearance features are then combined to produce an optimized model for detecting HOIs. To improve the model's performance on HOI detection, we train it in a multi-task learning setting with four different tasks. This approach lets us leverage representations shared across the tasks, improving both the accuracy and the efficiency of HOI detection in our system. To train and test our model, we introduce a new benchmark for HOI detection in paintings, which we call SemArt-HOI, by augmenting the existing SemArt dataset with instance detection annotations and interaction classes. Our experiments show that our model outperforms the state-of-the-art one-stage transformer-based HOI detection model in both single-task and multi-task settings. Furthermore, our system trains four times faster than the state-of-the-art model and uses fewer resources, which makes it well suited to practical, large-scale HOI detection in paintings.
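The abstract does not specify which spatial features are extracted or which four tasks are trained jointly, so the following is only a minimal sketch of the general idea under stated assumptions. The descriptor in `pair_spatial_features`, the task names, and the loss weights are all hypothetical illustrations, not the authors' method; the weighted sum in `multitask_loss` is one common way to combine per-task losses, used here as a stand-in.

```python
import math
from typing import Dict, List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, with positive width and height.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pair_spatial_features(human: Box, obj: Box) -> List[float]:
    """Hypothetical spatial descriptor for one detected human-object pair."""
    hw, hh = human[2] - human[0], human[3] - human[1]
    ow, oh = obj[2] - obj[0], obj[3] - obj[1]
    hcx, hcy = human[0] + hw / 2, human[1] + hh / 2
    ocx, ocy = obj[0] + ow / 2, obj[1] + oh / 2
    return [
        (ocx - hcx) / hw,    # horizontal centre offset, in human-box widths
        (ocy - hcy) / hh,    # vertical centre offset, in human-box heights
        math.log(ow / hw),   # log width ratio (object / human)
        math.log(oh / hh),   # log height ratio (object / human)
        iou(human, obj),     # overlap between the two boxes
    ]


def multitask_loss(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-task losses (task names and weights are assumptions)."""
    return sum(weights[name] * value for name, value in losses.items())


if __name__ == "__main__":
    human_box = (0.0, 0.0, 10.0, 20.0)
    object_box = (5.0, 5.0, 15.0, 15.0)
    print(pair_spatial_features(human_box, object_box))

    # Placeholder task names: one main HOI task plus three auxiliary tasks.
    per_task = {"hoi": 1.0, "aux_1": 2.0, "aux_2": 0.5, "aux_3": 1.0}
    task_weights = {"hoi": 1.0, "aux_1": 0.5, "aux_2": 0.5, "aux_3": 0.5}
    print(multitask_loss(per_task, task_weights))
```

In a real system the spatial descriptor would be concatenated with visual features from the detector and fed to an interaction classifier, and the per-task losses would be tensors from a deep learning framework rather than plain floats.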

Source journal
CiteScore: 5.40
Self-citation rate: 0.00%
Articles published: 33