Generating eco-friendly ionic liquids with enhanced CO2 solubility using language models
Adroit T.N. Fajar, Guillaume Lambard, Md. Amirul Islam, Bidyut B. Saha, Zakiah D. Nurfajrin, Kevin Septioga
Artificial intelligence chemistry, vol. 3 (1), Article 100089. Published 2025-05-22.
DOI: 10.1016/j.aichem.2025.100089
https://www.sciencedirect.com/science/article/pii/S2949747725000065
Abstract
This study presents a viable approach for designing eco-friendly ionic liquids (ILs) with enhanced CO2 solubility using language models, specifically GPT-2 in conjunction with SMILES-X. The GPT-2 model was fine-tuned on a relatively small, unlabeled IL dataset and subsequently used to generate diverse IL structures. SMILES-X models, trained on IL datasets labeled with CO2 solubility and eco-toxicity values, were employed to predict the properties of the generated ILs. Trends observed in the predicted IL properties were validated using density functional theory (DFT) and COSMO-RS calculations. The GPT-2 model was then fine-tuned iteratively, with the training data updated by including the top generated ILs from previous cycles. This iterative process led to a gradual improvement in the properties of the generated ILs. It was also observed, however, that continuously adding curated generated ILs to the training data eventually caused the model to produce correct but unrealistic IL structures. These findings highlight both the potential and limitations of language models in designing novel chemicals. Additionally, the CO2 adsorption capacity of a surrogate IL was experimentally measured, demonstrating the potential of this approach in advancing decarbonization technologies.
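The iterative cycle described in the abstract (fine-tune, generate, predict properties, keep the top candidates, add them to the training data) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `iterative_design`, `toy_fine_tune`, and `toy_score` are hypothetical names, the generator is a random string mutator standing in for a fine-tuned GPT-2, the scorer is a placeholder for SMILES-X property predictions, and the input strings are arbitrary tokens rather than real ionic liquids. The abstract does not say how CO2 solubility and eco-toxicity are combined into a ranking, so a single score function is assumed here.

```python
import random
from typing import Callable, List

# A "generator" takes a sample count and returns candidate strings.
Generator = Callable[[int], List[str]]


def iterative_design(
    seed_set: List[str],
    fine_tune: Callable[[List[str]], Generator],
    score: Callable[[str], float],
    cycles: int = 3,
    n_generate: int = 20,
    top_k: int = 5,
) -> List[List[str]]:
    """Run the generate/score/select loop and return the top picks per cycle."""
    training = list(seed_set)
    seen = set(training)
    history: List[List[str]] = []
    for _ in range(cycles):
        # Hypothetical stand-in for fine-tuning the language model.
        generate = fine_tune(training)
        # Deduplicate while preserving order, then drop anything already known.
        unique = list(dict.fromkeys(generate(n_generate)))
        novel = [s for s in unique if s not in seen]
        # Rank by predicted property (higher is better) and keep the best.
        top = sorted(novel, key=score, reverse=True)[:top_k]
        # Feed the selected candidates back into the next cycle's training data.
        training.extend(top)
        seen.update(top)
        history.append(top)
    return history


def toy_fine_tune(training: List[str], rng: random.Random) -> Generator:
    """Placeholder 'model': appends one random character to a training string."""
    snapshot = list(training)

    def generate(n: int) -> List[str]:
        return [rng.choice(snapshot) + rng.choice("CFO") for _ in range(n)]

    return generate


def toy_score(s: str) -> float:
    """Placeholder property predictor: counts 'F' characters."""
    return float(s.count("F"))


rng = random.Random(0)
seed = ["CCN", "CCO", "CCF"]  # arbitrary placeholder strings
history = iterative_design(seed, lambda data: toy_fine_tune(data, rng), toy_score)
```

Because each cycle trains on its own best outputs, the toy generator drifts steadily toward whatever the scorer rewards. That is a simplified picture of why continuously adding curated generated ILs could eventually push a model toward unrealistic structures, as the abstract reports.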