Kareem S. Aggour, Vijay S. Kumar, Vipul K. Gupta, Alfredo Gabaldon, Paul Cuddihy, Varish Mulwad
Title: Semantics-Enabled Data Federation: Bringing Materials Scientists Closer to FAIR Data
Journal: Integrating Materials and Manufacturing Innovation
Publication date: 2024-04-09
DOI: 10.1007/s40192-024-00348-4 (https://doi.org/10.1007/s40192-024-00348-4)
Abstract
The development and discovery of new materials can be significantly enhanced through the adoption of FAIR (Findable, Accessible, Interoperable, and Reusable) data principles and the establishment of a robust data infrastructure in support of materials informatics. A FAIR data infrastructure and associated best practices empower materials scientists to access and make the most of a wealth of information on materials properties, structures, and behaviors, allowing them to collaborate effectively and enabling data-driven approaches to material discovery. To make data findable, accessible, interoperable, and reusable for materials scientists, we developed and are in the process of expanding a materials data infrastructure that captures, stores, and links data to enable a variety of analytics and visualizations. Our infrastructure follows three key architectural design philosophies: (i) capture data across a federated storage layer to minimize the storage footprint and maximize the query performance for each data type, (ii) use a knowledge graph-based data fusion layer to provide a single logical interface above the federated data repositories, and (iii) provide an ensemble of FAIR data access and reuse services atop the knowledge graph to make it easy for materials scientists and other domain experts to explore, use, and derive value from the data. This paper details our architectural approach and the open-source technologies used to build the capabilities and services, and describes two applications through which we have successfully demonstrated its use. In the first use case, we created a system to enable additive manufacturing data storage and process parameter optimization with a range of user-friendly visualizations. In the second use case, we created a system for exploring data from cathodic arc deposition experiments to develop a new steam turbine coating material, fusing a combination of materials data with physics-based equations to enable advanced reasoning over the combined knowledge using a natural language chatbot-like user interface.
About the journal:
The journal will publish: Research that supports building a model-based definition of materials and processes that is compatible with model-based engineering design processes and multidisciplinary design optimization; Descriptions of novel experimental or computational tools or data analysis techniques, and their application, that are to be used for ICME; Best practices in verification and validation of computational tools, sensitivity analysis, uncertainty quantification, and data management, as well as standards and protocols for software integration and exchange of data; In-depth descriptions of data, databases, and database tools; Detailed case studies on efforts, and their impact, that integrate experiment and computation to solve an enduring engineering problem in materials and manufacturing.