Title: Maximizing model generalization under feature and label shifts for structural damage detection via Bayesian theory
Authors: Xiaoyou Wang, Jinyang Jiao, Xiaoqing Zhou, Yong Xia
DOI: 10.1016/j.ymssp.2024.112052
Journal: Mechanical Systems and Signal Processing, Volume 224, Article 112052 (Q1, Engineering, Mechanical; IF 7.9)
Publication date: 2024-10-20
URL: https://www.sciencedirect.com/science/article/pii/S0888327024009506
Citations: 0
Abstract
Machine learning models suffer performance degradation when migrating to datasets whose distributions differ from those of the training data. This challenge limits their applications because recollecting labeled data to retrain models is expensive and time-consuming. Domain generalization (DG) aims to learn a predictive model that extracts invariant features across source domains and then generalizes to related but unseen target domains. However, most existing feature-invariant DG methods rely on unrealistic assumptions (e.g., a stable feature distribution or no label shift) and simplify the problem to learning either invariant feature marginal or conditional distributions. In practice, both feature and label shifts may exist, rendering these assumptions invalid. This study develops a novel DG method that simultaneously considers conditional, marginal, and label shifts. Using Bayes' theorem, DG is interpreted as a posterior distribution alignment problem, expressed in terms of the likelihood function, evidence, and prior. The likelihood function and evidence correspond to the feature conditional and marginal distributions, respectively, which are estimated via variational Bayesian inference. The conditional distributions across domains are aligned by minimizing the Kullback–Leibler divergence, and the marginal distributions across domains are aligned through moment minimization. The label prior shift is estimated using a label smoothing mechanism and class-wise prototype learning. Consequently, DG is achieved by aligning the label posterior distribution according to the Bayesian equation. Numerical and experimental examples demonstrate that the developed method outperforms state-of-the-art DG methods in damage detection for both mechanical and civil structures. The method can be extended to DG tasks in other fields.
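The abstract decomposes the label posterior via Bayes' theorem, p(y|x) = p(x|y)p(y)/p(x), and names two of the alignment terms explicitly: Kullback–Leibler divergence for the conditional (likelihood) term and moment matching for the marginal (evidence) term. The sketch below is not the paper's implementation; it is a minimal numpy illustration of those two discrepancy measures between two hypothetical feature domains, assuming (for the KL term) that per-domain features are summarized as diagonal Gaussians:

```python
import numpy as np

def kl_diag_gaussians(mu1, var1, mu2, var2):
    # Closed-form KL(N(mu1, var1) || N(mu2, var2)) for diagonal Gaussians,
    # summed over feature dimensions; always non-negative.
    return 0.5 * np.sum(
        np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0
    )

def moment_distance(x_a, x_b):
    # First- and second-moment discrepancy between two feature batches,
    # a simple stand-in for marginal-distribution (evidence) alignment.
    mean_gap = np.linalg.norm(x_a.mean(axis=0) - x_b.mean(axis=0))
    var_gap = np.linalg.norm(x_a.var(axis=0) - x_b.var(axis=0))
    return mean_gap + var_gap

# Toy features from two shifted "domains" (hypothetical data).
rng = np.random.default_rng(0)
x_src = rng.normal(0.0, 1.0, size=(256, 8))
x_tgt = rng.normal(0.5, 1.2, size=(256, 8))

kl = kl_diag_gaussians(x_src.mean(0), x_src.var(0), x_tgt.mean(0), x_tgt.var(0))
gap = moment_distance(x_src, x_tgt)
print(kl, gap)
```

In a training loop, terms like these would be added to the task loss so that the feature extractor is penalized for domain-dependent conditional and marginal statistics; the label-prior and prototype components of the paper's method are not sketched here.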
Journal introduction:
Journal name: Mechanical Systems and Signal Processing (MSSP)
Interdisciplinary focus: mechanical, aerospace, and civil engineering
Purpose: reporting scientific advancements of the highest quality, arising from new techniques in sensing, instrumentation, signal processing, modelling, and control of dynamic systems.