Introducing MagBERT: A language model for magnesium textual data mining and analysis
Journal: Journal of Magnesium and Alloys (JCR Q1, Metallurgy & Metallurgical Engineering; impact factor 15.8)
Published: 2024-08-01
DOI: 10.1016/j.jma.2024.08.010
URL: https://www.sciencedirect.com/science/article/pii/S2213956724002858
Citations: 0
Abstract
Magnesium (Mg) based materials hold immense potential for various applications owing to their light weight and high strength-to-weight ratio. However, to fully harness the potential of Mg alloys, structured analytics are essential to gain valuable insights from centuries of accumulated knowledge. Efficient information extraction from the vast corpus of scientific literature is crucial for this purpose. In this work, we introduce MagBERT, a BERT-based language model specifically trained for Mg-based materials. Trained on a dataset of approximately 370,000 abstracts focused on Mg and its alloys, MagBERT is designed to capture the intricate details and specialized terminology of this domain. Through rigorous evaluation, we demonstrate the effectiveness of MagBERT for information extraction using a fine-tuned named entity recognition (NER) model called MagNER. This NER model can extract mechanical, microstructural, and processing properties related to Mg alloys. For instance, we have created an Mg alloy dataset that includes properties such as ductility, yield strength, and ultimate tensile strength (UTS), along with standard alloy names. MagBERT is a novel advancement in the development of Mg-specific language models and a significant milestone for Mg alloy discovery and textual information extraction. By making the pre-trained weights of MagBERT publicly accessible, we aim to accelerate research and innovation in the field of Mg-based materials through efficient information extraction and knowledge discovery.
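As a rough illustration of the extraction step described in the abstract, the sketch below shows how token-level NER output could be grouped into entity spans before being written to a dataset. It is not the MagNER implementation: it assumes the common BIO tagging scheme, and the label names (ALLOY, PROPERTY, VALUE), the example sentence, and the numeric value are all hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entity spans."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities: List[Tuple[str, str]] = []
    current_label = None          # label of the span being built, if any
    current_tokens: List[str] = []

    def flush() -> None:
        """Close the open span (if any) and reset the buffers."""
        nonlocal current_label, current_tokens
        if current_label is not None:
            entities.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span.
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label continues the open span.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span,
            # closes the span; the stray token itself is dropped.
            flush()
    flush()
    return entities


# Hypothetical sentence and tags, invented purely for illustration.
tokens = ["The", "AZ31", "alloy", "showed", "a", "yield", "strength",
          "of", "200", "MPa", "."]
tags = ["O", "B-ALLOY", "O", "O", "O", "B-PROPERTY", "I-PROPERTY",
        "O", "B-VALUE", "I-VALUE", "O"]

for label, text in decode_bio(tokens, tags):
    print(f"{label}: {text}")
```

In a real pipeline the tags would come from the fine-tuned model rather than being written by hand, and each decoded span would typically be linked to its alloy name before being stored as a row in the property dataset.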
About the journal:
The Journal of Magnesium and Alloys serves as a global platform for both theoretical and experimental studies in magnesium science and engineering. It welcomes submissions investigating various scientific and engineering factors impacting the metallurgy, processing, microstructure, properties, and applications of magnesium and alloys. The journal covers all aspects of magnesium and alloy research, including raw materials, alloy casting, extrusion and deformation, corrosion and surface treatment, joining and machining, simulation and modeling, microstructure evolution and mechanical properties, new alloy development, magnesium-based composites, bio-materials and energy materials, applications, and recycling.