Haili Yang, Hao Xia, Sai Liu, Shan Chen, Lan Li, Xilong Liao, Lei Fei, Liangliang Xie, Jianping Tian, Xinjun Hu
Bioprocess and Biosystems Engineering (IF 3.5, JCR Q2, Biotechnology & Applied Microbiology). Journal Article. Published 2025-06-14. DOI: 10.1007/s00449-025-03187-5 (https://doi.org/10.1007/s00449-025-03187-5)
A study on time-series prediction and analysis of acidity of Daqu based on multivariate data fusion and KNN-Attention-LSTM-XGBoost modeling.
Daqu is a traditional Chinese brewing ingredient that serves dual functions of saccharification and fermentation during the brewing process. The acidity content during the Daqu fermentation process directly affects the quality of the Daqu. Traditional methods for measuring Daqu acidity are complex and exhibit lag, making it difficult to monitor fermentation acidity in real time. Given the strong correlation between Daqu acidity and environmental variables, this paper proposes a time series prediction model for Daqu acidity based on the KNN-Attention-LSTM-XGBoost model. Upon collecting and analyzing the microenvironmental parameters of Daqu, the XGBoost model was used to select two optimal imputation methods (LFBI and KNN). Partial Least Squares Regression (PLSR) was employed to extract key parameters, and feature extraction using the lag and rolling window methods was performed to capture temporal trends and fluctuations. Comparative analysis revealed that KNN preprocessing combined with the Attention-LSTM-XGBoost model performed best in predicting Daqu acidity, with R² values reaching 0.9790, 0.9768, and 0.9636 for the upper, middle, and lower Daqu layers, respectively. This combination outperformed the LSTM-XGBoost and XGBoost models, with improvements of 3.87%, 1.11%, and 2.84% compared to LSTM-XGBoost, and 4.70%, 4.37%, and 8.46% compared to XGBoost. This study addresses the challenge of predicting Daqu acidity during fermentation and provides insights into the optimization of the Daqu fermentation process.
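As a rough illustration of the lag and rolling-window feature extraction mentioned in the abstract, the sketch below builds such features from a single sensor series using only the Python standard library. The abstract does not state the lag orders, window length, or variable names, so the function name `lag_rolling_features`, the defaults `lags=(1, 2)` and `window=3`, and the temperature readings are all illustrative assumptions rather than the authors' settings.

```python
from statistics import mean, pstdev


def lag_rolling_features(series, lags=(1, 2), window=3):
    """Return one feature row per time step that has enough history.

    Lag features copy earlier readings; rolling features summarise the
    `window` readings strictly before the current step, so the current
    value never leaks into its own features.
    """
    start = max(max(lags), window)
    rows = []
    for t in range(start, len(series)):
        past = series[t - window:t]  # readings before step t only
        row = {f"lag_{k}": series[t - k] for k in lags}
        row["roll_mean"] = mean(past)   # local trend
        row["roll_std"] = pstdev(past)  # local fluctuation
        row["t"] = t
        rows.append(row)
    return rows


# Hypothetical microenvironment temperature readings (not from the paper).
temperature = [30.0, 31.0, 33.0, 36.0, 40.0, 45.0]
features = lag_rolling_features(temperature)
for row in features:
    print(row)
```

In a pipeline like the one the abstract describes, rows of this kind (one set per selected environmental variable) would form the input to the downstream regressor, with the measured acidity at step `t` as the target.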
About the journal:
Bioprocess and Biosystems Engineering provides an international peer-reviewed forum to facilitate the discussion between engineering and biological science to find efficient solutions in the development and improvement of bioprocesses. The aim of the journal is to focus more attention on the multidisciplinary approaches for integrative bioprocess design. Of special interest are the rational manipulation of biosystems through metabolic engineering techniques to provide new biocatalysts as well as the model-based design of bioprocesses (upstream processing, bioreactor operation and downstream processing) that will lead to new and sustainable production processes.
Contributions are targeted at new approaches for rational and evolutive design of cellular systems by taking into account the environment and constraints of technical production processes, integration of recombinant technology and process design, as well as new hybrid intersections such as bioinformatics and process systems engineering. Manuscripts concerning the design, simulation, experimental validation, control, and economic as well as ecological evaluation of novel processes using biosystems or parts thereof (e.g., enzymes, microorganisms, mammalian cells, plant cells, or tissue), their related products, or technical devices are also encouraged.
The Editors will consider papers for publication based on novelty, their impact on biotechnological production and their contribution to the advancement of bioprocess and biosystems engineering science. Submission of papers dealing with routine aspects of bioprocess engineering (e.g., routine application of established methodologies, and description of established equipment) is discouraged.