A study on time-series prediction and analysis of Daqu acidity based on multivariate data fusion and KNN-Attention-LSTM-XGBoost modeling

IF 3.5 | Zone 3 (Biology) | JCR Q2 (Biotechnology & Applied Microbiology)
Haili Yang, Hao Xia, Sai Liu, Shan Chen, Lan Li, Xilong Liao, Lei Fei, Liangliang Xie, Jianping Tian, Xinjun Hu
DOI: 10.1007/s00449-025-03187-5 (https://doi.org/10.1007/s00449-025-03187-5)
Journal: Bioprocess and Biosystems Engineering
Published: 2025-06-14
Type: Journal Article
Citations: 0

Abstract

A study on time-series prediction and analysis of acidity of Daqu based on multivariate data fusion and KNN-Attention-LSTM-XGBoost modeling.

Daqu is a traditional Chinese brewing ingredient that serves the dual functions of saccharification and fermentation during brewing. Acidity during Daqu fermentation directly affects Daqu quality. Traditional methods for measuring Daqu acidity are complex and lag behind the process, making it difficult to monitor fermentation acidity in real time. Given the strong correlation between Daqu acidity and environmental variables, this paper proposes a time-series prediction model for Daqu acidity based on the KNN-Attention-LSTM-XGBoost model. After the microenvironmental parameters of Daqu were collected and analyzed, the XGBoost model was used to select the two best imputation methods (LFBI and KNN). Partial Least Squares Regression (PLSR) was employed to extract key parameters, and lag and rolling-window features were constructed to capture temporal trends and fluctuations. Comparative analysis showed that KNN preprocessing combined with the Attention-LSTM-XGBoost model performed best in predicting Daqu acidity, with R² values of 0.9790, 0.9768, and 0.9636 for the upper, middle, and lower Daqu layers, respectively. This combination outperformed the LSTM-XGBoost and XGBoost models, with improvements of 3.87%, 1.11%, and 2.84% over LSTM-XGBoost and of 4.70%, 4.37%, and 8.46% over XGBoost. This study addresses the challenge of predicting Daqu acidity during fermentation and provides insights into the optimization of the Daqu fermentation process.
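
The two preprocessing ideas named in the abstract, KNN imputation and lag/rolling-window features, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no neighbor count, lag set, or window length, so `k`, `lags`, `window`, the function names, and the sample readings below are all hypothetical. The distance used here (Euclidean over jointly observed columns, rescaled by the share of columns used) is one common choice and is likewise an assumption.

```python
import math
from statistics import mean


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is Euclidean over columns observed in both rows, rescaled by the
    fraction of columns that could be compared (an assumed, common variant).
    Raises statistics.StatisticsError if no other row has the column observed.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * len(a) / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row that has column j observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            filled[i][j] = mean(v for _, v in donors)
    return filled


def lag_and_rolling_features(series, lags=(1, 2), window=3):
    """Build one feature dict per time step that has enough history."""
    start = max(max(lags), window)
    features = []
    for t in range(start, len(series)):
        past = series[t - window:t]  # values strictly before t, so no leakage
        row = {f"lag_{lag}": series[t - lag] for lag in lags}
        row["roll_mean"] = mean(past)             # local trend
        row["roll_range"] = max(past) - min(past)  # local fluctuation
        row["target"] = series[t]
        features.append(row)
    return features


# Hypothetical sensor readings: [temperature, humidity], one value missing.
readings = [[30.0, 80.0], [32.0, None], [34.0, 84.0], [36.0, 86.0]]
imputed = knn_impute(readings, k=2)

# Hypothetical acidity series turned into a supervised-learning table.
acidity = [1.0, 2.0, 3.0, 5.0, 8.0]
table = lag_and_rolling_features(acidity, lags=(1, 2), window=3)
```

Each row of `table` pairs past-only features with the value to predict, which is the shape of input a downstream regressor would consume. The rolling statistics are computed from values before time `t` only, so the target never leaks into its own features.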

Source journal
Bioprocess and Biosystems Engineering (Engineering & Technology - Engineering: Chemical)
CiteScore: 7.90
Self-citation rate: 2.60%
Articles published: 147
Review time: 2.6 months
Journal description: Bioprocess and Biosystems Engineering provides an international peer-reviewed forum to facilitate the discussion between engineering and biological science to find efficient solutions in the development and improvement of bioprocesses. The aim of the journal is to focus more attention on the multidisciplinary approaches for integrative bioprocess design. Of special interest are the rational manipulation of biosystems through metabolic engineering techniques to provide new biocatalysts as well as the model-based design of bioprocesses (upstream processing, bioreactor operation, and downstream processing) that will lead to new and sustainable production processes. Contributions are targeted at new approaches for rational and evolutive design of cellular systems by taking into account the environment and constraints of technical production processes, integration of recombinant technology and process design, as well as new hybrid intersections such as bioinformatics and process systems engineering. Manuscripts concerning the design, simulation, experimental validation, control, and economic as well as ecological evaluation of novel processes using biosystems or parts thereof (e.g., enzymes, microorganisms, mammalian cells, plant cells, or tissue), their related products, or technical devices are also encouraged. The Editors will consider papers for publication based on novelty, their impact on biotechnological production, and their contribution to the advancement of bioprocess and biosystems engineering science. Submission of papers dealing with routine aspects of bioprocess engineering (e.g., routine application of established methodologies and description of established equipment) is discouraged.