Information-theoretic and Bayesian model selection for physics-based modeling: Balancing fit, complexity, and generalization

Impact factor 6.8; Zone 1 (Computer Science); JCR category: COMPUTER SCIENCE, INFORMATION SYSTEMS
Xinyue Xu , Julian Wang
Journal: Information Sciences, volume 726, Article 122743
Published: 2025-10-02
DOI: 10.1016/j.ins.2025.122743
URL: https://www.sciencedirect.com/science/article/pii/S0020025525008795

Abstract

Reliable model selection is a cornerstone of developing physics-based models of engineering systems. However, existing model selection criteria have not been investigated across a variety of calibration scenarios, where selection choices can be affected by (i) parameter dimensionality, (ii) model form, (iii) prior informativeness, (iv) reparameterization, and (v) data characteristics. Moreover, it remains unclear whether these criteria can reliably identify added model fidelity that genuinely improves explanatory power. These limitations restrict the broader applicability of model selection criteria in physics-based modeling, where balancing goodness-of-fit, complexity, and generalization is critical. To address these gaps, this study systematically evaluates information-theoretic and Bayesian model selection criteria through two case studies. The first case study employs polynomial regression models to isolate the effects of the calibration factors and investigate their influence on the selection behavior of the criteria. The second case study extends the analysis to a hierarchy of thermal models for double-pane windows, examining the ability of the selection criteria to differentiate effective complexity from superficial increases in model fidelity. Results indicate that classical information-theoretic criteria are sensitive to parameter dimensionality, covariance-based criteria reflect changes in model form and data characteristics, and Bayesian criteria are sensitive to all examined calibration factors. Furthermore, both covariance-based and Bayesian criteria effectively identify secondary physical mechanisms as sources of ineffective complexity, penalizing redundant fidelity. These findings underscore that model selection is not a one-size-fits-all task, and that the choice of model selection criteria should be informed by the calibration scenario and the modeling objective.
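To make the polynomial-regression setting concrete, the following is a minimal sketch of how two widely used information criteria, AIC and BIC, score a family of polynomial fits. It is an illustration only: the abstract does not name the specific criteria evaluated, so treating AIC and BIC as examples of "classical information-theoretic criteria" is an assumption, as are the synthetic data, the Gaussian-noise likelihood, and the helper names `gaussian_ic` and `score_polynomials`.

```python
import numpy as np


def gaussian_ic(y, y_hat, n_coeffs):
    """Return (AIC, BIC) for a least-squares fit, assuming i.i.d. Gaussian errors."""
    n = y.size
    rss = float(np.sum((y - y_hat) ** 2))
    sigma2 = rss / n  # maximum-likelihood estimate of the noise variance
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    k = n_coeffs + 1  # polynomial coefficients plus the noise variance
    aic = 2.0 * k - 2.0 * log_lik
    bic = k * np.log(n) - 2.0 * log_lik
    return aic, bic


def score_polynomials(x, y, max_degree):
    """Fit polynomials of degree 0..max_degree and score each with AIC and BIC."""
    scores = {}
    for degree in range(max_degree + 1):
        coeffs = np.polynomial.polynomial.polyfit(x, y, degree)
        y_hat = np.polynomial.polynomial.polyval(x, coeffs)
        scores[degree] = gaussian_ic(y, y_hat, degree + 1)
    return scores


# Hypothetical synthetic data: a quadratic trend plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)

scores = score_polynomials(x, y, max_degree=6)
for degree, (aic, bic) in scores.items():
    print(f"degree={degree}  AIC={aic:9.2f}  BIC={bic:9.2f}")

best_by_bic = min(scores, key=lambda d: scores[d][1])
print("degree with lowest BIC:", best_by_bic)
```

In this sketch the penalty term of each criterion depends only on the parameter count `k` (and, for BIC, the sample size), which is one way to read the abstract's statement that classical criteria are sensitive to parameter dimensionality. The covariance-based and Bayesian criteria discussed in the abstract are not covered here.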
Source journal: Information Sciences (Engineering and Technology: Computer Science, Information Systems)
CiteScore: 14.00
Self-citation rate: 17.30%
Articles published: 1322
Review time: 10.4 months
About the journal: Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and surveying contributions. Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.