Vision transformers in domain adaptation and domain generalization: a study of robustness

Neural Computing and Applications Pub Date : 2024-08-22 DOI:10.1007/s00521-024-10353-5

Shadi Alijani, Jamil Fayyad, Homayoun Najjaran



Abstract

Deep learning models are often evaluated in scenarios where the data distribution differs from the one used in the training and validation phases. This discrepancy makes it difficult to predict accurately how a model will perform once deployed on the target distribution. Domain adaptation and domain generalization are widely recognized as effective strategies for addressing such shifts, thereby ensuring reliable performance. Recent promising results from applying vision transformers to computer vision tasks, coupled with advances in self-attention mechanisms, have demonstrated their significant potential for robustness and generalization under distribution shifts. Motivated by growing interest from the research community, our paper investigates the deployment of vision transformers in domain adaptation and domain generalization scenarios. For domain adaptation, we categorize research into feature-level, instance-level, model-level, and hybrid approaches, along with further categorizations based on the diverse strategies used to enhance domain adaptation. Similarly, for domain generalization, we categorize research into multi-domain learning, meta-learning, regularization techniques, and data augmentation strategies. We further classify diverse strategies in research, underscoring the various approaches researchers have taken to address distribution shifts by integrating vision transformers. A distinct feature of our work is the inclusion of comprehensive tables summarizing these categories, offering valuable insights for researchers. These findings highlight the versatility of vision transformers in managing distribution shifts, which is crucial for real-world applications, especially in safety-critical and decision-making scenarios.


