{"title":"Performance and accuracy assessments of an incompressible fluid solver coupled with a deep convolutional neural network—ADDENDUM","authors":"Ekhi Ajuria Illarramendi, M. Bauerheim, B. Cuenot","doi":"10.1017/dce.2022.10","DOIUrl":"https://doi.org/10.1017/dce.2022.10","url":null,"abstract":"","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42904074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian optimization with informative parametric models via sequential Monte Carlo","authors":"Rafael Oliveira, R. Scalzo, R. Kohn, Sally Cripps, Kyle Hardman, J. Close, Nasrin Taghavi, C. Lemckert","doi":"10.1017/dce.2022.5","DOIUrl":"https://doi.org/10.1017/dce.2022.5","url":null,"abstract":"Bayesian optimization (BO) has been a successful approach to optimize expensive functions whose prior knowledge can be specified by means of a probabilistic model. Due to their expressiveness and tractable closed-form predictive distributions, Gaussian process (GP) surrogate models have been the default go-to choice when deriving BO frameworks. However, as nonparametric models, GPs offer very little in terms of interpretability and informative power when applied to model complex physical phenomena in scientific applications. In addition, the Gaussian assumption also limits the applicability of GPs to problems where the variables of interest may highly deviate from Gaussianity. In this article, we investigate an alternative modeling framework for BO which makes use of sequential Monte Carlo (SMC) to perform Bayesian inference with parametric models. We propose a BO algorithm to take advantage of SMC’s flexible posterior representations and provide methods to compensate for bias in the approximations and reduce particle degeneracy. Experimental results on simulated engineering applications in detecting water leaks and contaminant source localization are presented showing performance improvements over GP-based BO approaches.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42701494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Markov chain Monte Carlo for a hyperbolic Bayesian inverse problem in traffic flow modeling","authors":"Jeremie Coullon, Y. Pokern","doi":"10.1017/dce.2022.3","DOIUrl":"https://doi.org/10.1017/dce.2022.3","url":null,"abstract":"As a Bayesian approach to fitting motorway traffic flow models remains rare in the literature, we empirically explore the sampling challenges this approach poses, which stem from the strong correlations and multimodality of the posterior distribution. In particular, we provide a unified statistical model to estimate, from motorway data, both the boundary conditions and the fundamental diagram parameters of the motorway traffic flow model of Lighthill, Whitham, and Richards (LWR). This allows us to provide a traffic flow density estimation method that is shown to be superior to two methods found in the traffic flow literature. To sample from this challenging posterior distribution, we use a state-of-the-art gradient-free function space sampler augmented with parallel tempering.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48769974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Universal Digital Twin: Land use","authors":"J. Akroyd, Zachary S. Harper, David Soutar, Feroz Farazi, Amita Bhave, S. Mosbach, M. Kraft","doi":"10.1017/dce.2021.21","DOIUrl":"https://doi.org/10.1017/dce.2021.21","url":null,"abstract":"This article develops an ontological description of land use and applies it to incorporate geospatial information describing land coverage into a knowledge-graph-based Universal Digital Twin. Sources of data relating to land use in the UK have been surveyed. The Crop Map of England (CROME) is produced annually by the UK Government and was identified as a valuable source of open data. Formal ontologies to represent land use and the geospatial data arising from such surveys have been developed. The ontologies have been deployed using a high-performance graph database. A customized vocabulary was developed to extend the geospatial capabilities of the graph database to support the CROME data. The integration of the CROME data into the Universal Digital Twin is demonstrated in two use cases that show the potential of the Universal Digital Twin to share data across sectors. The first use case combines data about land use with a geospatial analysis of scenarios for energy provision. The second illustrates how the Universal Digital Twin could use the land use data to support the cross-domain analysis of flood risk. Opportunities for the extension and enrichment of the ontologies, and further development of the Universal Digital Twin are discussed.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46117462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of a digital twin operational platform using Python Flask","authors":"M. Bonney, M. de Angelis, M. Dal Borgo, Luis Andrade, S. Beregi, N. Jamia, D. Wagg","doi":"10.1017/dce.2022.1","DOIUrl":"https://doi.org/10.1017/dce.2022.1","url":null,"abstract":"The digital twin concept has developed as a method for extracting value from data, and is being developed as a new technique for the design and asset management of high-value engineering systems such as aircraft, energy generating plant, and wind turbines. In terms of implementation, many proprietary digital twin software solutions have been marketed in this domain. In contrast, this paper describes a recently released open-source software framework for digital twins, which provides a browser-based operational platform using Python and Flask. The new platform is intended to maximize connectivity between users and data obtained from the physical twin. This paper describes how this type of digital twin operational platform (DTOP) can be used to connect the physical twin and other Internet-of-Things devices to both users and cloud computing services. The current release of the software, DTOP-Cristallo, uses the example of a three-storey structure as the engineering asset to be managed. Within DTOP-Cristallo, specific engineering software tools have been developed for use in the digital twin, and these are used to demonstrate the concept. At this stage, the framework presented is a prototype. However, the potential for open-source digital twin software using network connectivity is a very large area for future research and development.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42539430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uniform-in-phase-space data selection with iterative normalizing flows","authors":"M. Hassanaly, Bruce A. Perry, M. Mueller, S. Yellapantula","doi":"10.1017/dce.2023.4","DOIUrl":"https://doi.org/10.1017/dce.2023.4","url":null,"abstract":"Improvements in computational and experimental capabilities are rapidly increasing the amount of scientific data that are routinely generated. In applications that are constrained by memory and computational intensity, excessively large datasets may hinder scientific discovery, making data reduction a critical component of data-driven methods. Datasets are growing in two directions: the number of data points and their dimensionality. Whereas dimension reduction typically aims at describing each data sample in a lower-dimensional space, the focus here is on reducing the number of data points. A strategy is proposed to select data points such that they uniformly span the phase-space of the data. The algorithm proposed relies on estimating the probability map of the data and using it to construct an acceptance probability. An iterative method is used to accurately estimate the probability of the rare data points when only a small subset of the dataset is used to construct the probability map. Instead of binning the phase-space to estimate the probability map, its functional form is approximated with a normalizing flow. Therefore, the method naturally extends to high-dimensional datasets. The proposed framework is demonstrated as a viable pathway to enable data-efficient machine learning when abundant data are available.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44703688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hierarchical Bayesian approach for calibration of stochastic material models","authors":"Nikolaos Papadimas, T. Dodwell","doi":"10.1017/dce.2021.20","DOIUrl":"https://doi.org/10.1017/dce.2021.20","url":null,"abstract":"This article recasts the traditional challenge of calibrating a material constitutive model into a hierarchical probabilistic framework. We consider a Bayesian framework where material parameters are assigned distributions, which are then updated given experimental data. Importantly, in a true engineering setting, we are not interested in inferring the parameters for a single experiment, but rather inferring the model parameters over the population of possible experimental samples. In doing so, we seek to also capture the inherent variability of the material from coupon to coupon, as well as uncertainties around the repeatability of the test. In this article, we address this problem using a hierarchical Bayesian model. However, a vanilla computational approach is prohibitively expensive. Our strategy marginalizes over each individual experiment, decreasing the dimension of our inference problem to only the hyperparameters, that is, those parameters describing the population statistics of the material model. This marginalization step requires us to derive an approximate likelihood, for which we exploit an emulator (built offline prior to sampling) and Bayesian quadrature, allowing us to capture the uncertainty in this numerical approximation. Our approach renders hierarchical Bayesian calibration of material models computationally feasible. The approach is tested in two different examples. The first is a compression test of a simple spring model using synthetic data; the second is a more complex example using real experimental data to fit a stochastic elastoplastic model for 3D-printed steel.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41301936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Materials informatics and sustainability—The case for urgency","authors":"H. Melia, Eric S. Muckley, J. Saal","doi":"10.1017/dce.2021.19","DOIUrl":"https://doi.org/10.1017/dce.2021.19","url":null,"abstract":"The development of transformative technologies for mitigating our global environmental and technological challenges will require significant innovation in the design, development, and manufacturing of advanced materials and chemicals. To achieve this innovation faster than what is possible by traditional human intuition-guided scientific methods, we must transition to a materials informatics-centered paradigm, in which synergies between data science, materials science, and artificial intelligence are leveraged to enable transformative, data-driven discoveries faster than ever before through the use of predictive models and digital twins. While materials informatics is experiencing rapidly increasing use across the materials and chemicals industries, broad adoption is hindered by barriers such as skill gaps, cultural resistance, and data sparsity. We discuss the importance of materials informatics for accelerating technological innovation, describe current barriers and examples of good practices, and offer suggestions for how researchers, funding agencies, and educational institutions can help accelerate the adoption of urgently needed informatics-based toolsets for science in the 21st century.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47415973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiphase flow applications of nonintrusive reduced-order models with Gaussian process emulation","authors":"T. Botsas, Indranil Pan, L. Mason, O. Matar","doi":"10.1017/dce.2022.19","DOIUrl":"https://doi.org/10.1017/dce.2022.19","url":null,"abstract":"Reduced-order models (ROMs) are computationally inexpensive simplifications of high-fidelity complex ones. Such models can be found in computational fluid dynamics where they can be used to predict the characteristics of multiphase flows. In previous work, we presented a ROM analysis framework that coupled compression techniques, such as autoencoders, with Gaussian process regression in the latent space. This pairing has significant advantages over the standard encoding–decoding routine, such as the ability to interpolate or extrapolate in the initial conditions’ space, which can provide predictions even when simulation data are not available. In this work, we focus on this major advantage and show its effectiveness by applying the pipeline to three multiphase flow applications. We also extend the methodology by using deep Gaussian processes as the interpolation algorithm and compare the performance of our two variations, as well as another variation from the literature that uses long short-term memory networks, for the interpolation.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44487394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decision-theoretic inspection planning using imperfect and incomplete data","authors":"D. Di Francesco, M. Chryssanthopoulos, M. Faber, U. Bharadwaj","doi":"10.1017/dce.2021.18","DOIUrl":"https://doi.org/10.1017/dce.2021.18","url":null,"abstract":"Attempts to formalize inspection and monitoring strategies in industry have struggled to combine evidence from multiple sources (including subject matter expertise) in a mathematically coherent way. The perceived requirement for large amounts of data is often cited as the reason that quantitative risk-based inspection is incompatible with the sparse and imperfect information that is typically available to structural integrity engineers. Current industrial guidance is also limited in its methods of distinguishing quality of inspections, as this is typically based on simplified (qualitative) heuristics. In this paper, Bayesian multi-level (partial pooling) models are proposed as a flexible and transparent method of combining imperfect and incomplete information, to support decision-making regarding the integrity management of in-service structures. This work builds on the established theoretical framework for computing the expected value of information, by allowing for partial pooling between inspection measurements (or groups of measurements). This method is demonstrated for a simulated example of a structure with active corrosion in multiple locations, which acknowledges that the data will be associated with some precision, bias, and reliability. Quantifying the extent to which an inspection of one location can reduce uncertainty in damage models at remote locations has been shown to influence many aspects of the expected value of an inspection. These results are considered in the context of the current challenges in risk-based structural integrity management.","PeriodicalId":34169,"journal":{"name":"DataCentric Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42183233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}