COEFF-KANs: A Paradigm to Address the Electrolyte Field with KANs
Authors: Xinhe Li, Zhuoying Feng, Yezeng Chen, Weichen Dai, Zixu He, Yi Zhou, Shuhong Jiao
Venue: arXiv - CS - Computational Engineering, Finance, and Science
Publication date: 2024-07-24
DOI: arxiv-2407.20265
Abstract
To reduce the experimental validation workload for chemical researchers and
accelerate the design and optimization of high-energy-density lithium metal
batteries, we aim to leverage models to automatically predict Coulombic
Efficiency (CE) based on the composition of liquid electrolytes. Existing
methods follow two representative paradigms: machine learning and deep
learning. The former requires careful input feature selection and reliable
computational methods, so errors in feature estimation propagate to the
model's predictions, while the latter (e.g. MultiModal-MoLFormer) suffers from
poor predictive performance and overfitting due to the limited diversity of
augmented data. To tackle these issues, we propose a novel method,
COEFF (COulombic EFficiency prediction via Fine-tuned models), which consists of
two stages: pre-training a general-purpose chemical model and fine-tuning it on
downstream domain data. First, we adopt the publicly available MoLFormer model to obtain
feature vectors for each solvent and salt in the electrolyte. Then, we compute
a token-wise weighted average of these embeddings across all molecules, with
weights given by the ratios of the respective electrolyte components. Finally, we
feed the resulting electrolyte features into a Multi-layer Perceptron (MLP) or
Kolmogorov-Arnold Network (KAN) to predict CE. Experimental results on a
real-world dataset show that our method achieves state-of-the-art (SOTA)
performance in predicting CE compared with all baselines. Data and code used in
this work will be made publicly available after the paper is published.
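
The ratio-weighted averaging step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code, which the abstract says is not yet released. The function name `electrolyte_features`, the padding of every molecule to a common token length, and the normalisation of the ratios to sum to one are all assumptions made for illustration, and random arrays stand in for real MoLFormer embeddings.

```python
import numpy as np


def electrolyte_features(token_embeddings, ratios):
    """Ratio-weighted, token-wise average of per-molecule embeddings.

    token_embeddings: shape (n_molecules, n_tokens, dim). Assumption: every
        molecule is padded to the same token length.
    ratios: shape (n_molecules,), the component ratios. Assumption: they are
        normalised here so that the weights sum to 1.
    Returns an array of shape (n_tokens, dim).
    """
    emb = np.asarray(token_embeddings, dtype=float)
    w = np.asarray(ratios, dtype=float)
    if emb.ndim != 3 or w.shape != (emb.shape[0],):
        raise ValueError("expected embeddings (n, T, d) and ratios (n,)")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("ratios must be non-negative with a positive sum")
    w = w / w.sum()
    # Contract the molecule axis: out[t, k] = sum_i w[i] * emb[i, t, k]
    return np.tensordot(w, emb, axes=(0, 0))


# Hypothetical electrolyte with three components (e.g. two solvents, one salt).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 8, 16))  # stand-in for MoLFormer outputs
features = electrolyte_features(embeddings, ratios=[0.5, 0.3, 0.2])
print(features.shape)  # (8, 16)
```

The abstract does not say how the resulting per-token matrix is reduced or flattened before it reaches the MLP or KAN, so the sketch stops at the averaged features.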