Efficient estimation of material property curves and surfaces via active learning

arXiv: Materials Science | Pub Date: 2020-10-14 | DOI: 10.1103/PHYSREVMATERIALS.5.013802

Yuan Tian, D. Xue, Ruihao Yuan, Yumei Zhou, Xiangdong Ding, Jun Sun, T. Lookman


Citations: 9

Abstract

The relationship between material properties and independent variables such as temperature, external field, or time is usually represented by a curve or surface in a multi-dimensional space. Determining such a curve or surface requires a series of experiments or calculations, which are often time-consuming and costly. A general strategy uses an appropriate utility function to sample the space and recommend the next optimal experiment or calculation within an active learning loop. However, knowing which sampling strategy minimizes the number of experiments remains an outstanding problem. We compare a number of strategies based on directed exploration on several materials problems of varying complexity using a Kriging-based model. These include one-dimensional curves such as the fatigue life curve for 304L stainless steel and the liquidus line of the Fe-C phase diagram, surfaces such as the Hartmann 3 function in 3D space and the fitted intermolecular potential for Ar-SH, and a four-dimensional data set of experimental measurements for BaTiO3-based ceramics. We also consider the effects of experimental noise on the Hartmann 3 function. We find that directed exploration guided by maximum variance provides better performance overall, converging faster across several data sets. However, for certain problems, the trade-off methods incorporating exploitation can perform at least as well as, if not better than, maximum variance. Thus, we discuss how the choice of the utility function depends on the distribution of the data, the model performance and uncertainties, additive noise, and the budget.
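To make the loop described in the abstract concrete, the following is a minimal sketch of active learning with a maximum-variance utility on a one-dimensional curve. It is not the authors' implementation: the test curve `toy_curve`, the squared-exponential kernel, the `length_scale` and `noise` values, the candidate grid, and the budget are all assumptions chosen for illustration, and the Kriging model is a bare-bones Gaussian process written with NumPy.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.15):
    """Squared-exponential covariance between two sets of 1D points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def kriging_predict(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and variance of a constant-mean Gaussian process."""
    y_mean = y_train.mean()
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    v = np.linalg.solve(L, K_s)
    mean = y_mean + K_s.T @ alpha
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance is 1 for this kernel
    return mean, np.maximum(var, 0.0)


def toy_curve(x):
    """Hypothetical stand-in for an expensive experiment or calculation."""
    return np.sin(6.0 * x) + 0.5 * x


def active_learning(n_init=3, budget=12, seed=0):
    """Run the loop; return the sampled x values and the RMSE per iteration."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 201)
    measured = [int(i) for i in rng.choice(len(candidates), n_init, replace=False)]
    truth = toy_curve(candidates)  # only available because this is a toy problem
    errors = []
    for _ in range(budget):
        x_train = candidates[measured]
        y_train = toy_curve(x_train)  # "run the experiments"
        mean, var = kriging_predict(x_train, y_train, candidates)
        errors.append(float(np.sqrt(np.mean((mean - truth) ** 2))))
        var[measured] = -np.inf  # never recommend an already measured point
        measured.append(int(np.argmax(var)))  # maximum-variance utility
    return candidates[measured], errors


if __name__ == "__main__":
    xs, errors = active_learning()
    print("sampled x:", np.round(xs, 3))
    print("RMSE first/last iteration:", errors[0], errors[-1])
```

Swapping the `np.argmax(var)` line for a utility that also uses `mean` would give one of the trade-off strategies the abstract mentions; the rest of the loop would stay the same.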

