Methodology for contamination detection and reduction in fermentation processes using machine learning.

IF 3.6 | Zone 3 (Biology) | JCR Q2, BIOTECHNOLOGY & APPLIED MICROBIOLOGY

Bioprocess and Biosystems Engineering | Pub Date: 2025-09-01 | Epub Date: 2025-06-26 | DOI: 10.1007/s00449-025-03194-6

Xuan Dung James Nguyen, Y A Liu, Christopher C McDowell, Luke Dooley

Pages: 1547-1563 | Publication type: Journal Article | Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12367959/pdf/

Citations: 0

Abstract



This paper demonstrates an accurate and efficient methodology for detecting and reducing fermentation contamination using two machine learning (ML) methods: a one-class support vector machine and autoencoders. Before training the ML models, we optimize as many hyperparameters as possible to improve model accuracy and efficiency, and we use Optuna, a Python platform, to run hyperparameter optimization (HPO) in parallel. We recommend Bayesian optimization with the Hyperband algorithm for HPO. Results show that we can predict contaminated fermentation batches with recall up to 1.0 without sacrificing the precision and specificity of non-contaminated batches, which reach up to 0.96 and 0.99, respectively. The one-class support vector machine outperforms the autoencoders in precision and specificity, although both achieve a recall of 1.0. These models detect contamination with high accuracy without requiring labeled contaminated data, and they are suitable for integration into real-time fermentation monitoring systems with minimal latency and retraining needs. In addition, we benchmark our ML methods against a traditional threshold-based contamination detection approach (the mean $\pm 3\sigma$ rule) to quantify the added value of data-driven models. Finally, we identify important independent variables contributing to the contaminated batches and recommend how to regulate them to reduce the likelihood of contamination.
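To make the threshold baseline and the reported metrics concrete, the following is a minimal sketch of a mean ± 3σ rule and of recall, precision, and specificity. It is not the authors' implementation: the abstract does not say which variables the rule is applied to, so the single feature here (an end-of-batch pH), the function names, and all data values are hypothetical. Treating "contaminated" as the positive class is also an assumption.

```python
from statistics import mean, stdev


def fit_three_sigma(normal_values):
    """Control limits (mean - 3*sigma, mean + 3*sigma) from non-contaminated batches."""
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # sample standard deviation
    return mu - 3 * sigma, mu + 3 * sigma


def flag_batches(values, limits):
    """Return True for each batch whose value falls outside the control limits."""
    lower, upper = limits
    return [not (lower <= v <= upper) for v in values]


def detection_metrics(y_true, y_pred):
    """Recall, precision, specificity; True means 'contaminated' (assumed positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fn = sum(1 for t, p in pairs if t and not p)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical end-of-batch pH values; illustrative only, not data from the paper.
normal_train = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.0]
test_values = [5.0, 5.3, 4.7, 5.6, 4.2, 5.4]
test_labels = [False, False, False, True, True, False]  # True = contaminated

limits = fit_three_sigma(normal_train)
flags = flag_batches(test_values, limits)
metrics = detection_metrics(test_labels, flags)
print(limits, flags, metrics)
```

In this toy set both contaminated batches are flagged, so recall is 1.0, but the clean batch at 5.4 also falls outside the limits, which lowers precision to 2/3 and specificity to 0.75. This is why the abstract reports precision and specificity alongside recall.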


Source journal

Bioprocess and Biosystems Engineering (Engineering & Technology: Chemical Engineering)

CiteScore: 7.90
Self-citation rate: 2.60%
Articles published: 147
Review time: 2.6 months

About the journal: Bioprocess and Biosystems Engineering provides an international peer-reviewed forum to facilitate the discussion between engineering and biological science to find efficient solutions in the development and improvement of bioprocesses. The aim of the journal is to focus more attention on multidisciplinary approaches for integrative bioprocess design. Of special interest are the rational manipulation of biosystems through metabolic engineering techniques to provide new biocatalysts, as well as the model-based design of bioprocesses (upstream processing, bioreactor operation and downstream processing) that will lead to new and sustainable production processes. Contributions are targeted at new approaches for rational and evolutive design of cellular systems by taking into account the environment and constraints of technical production processes, integration of recombinant technology and process design, as well as new hybrid intersections such as bioinformatics and process systems engineering. Manuscripts concerning the design, simulation, experimental validation, control, and economic as well as ecological evaluation of novel processes using biosystems or parts thereof (e.g., enzymes, microorganisms, mammalian cells, plant cells, or tissue), their related products, or technical devices are also encouraged. The Editors will consider papers for publication based on novelty, their impact on biotechnological production and their contribution to the advancement of bioprocess and biosystems engineering science. Submission of papers dealing with routine aspects of bioprocess engineering (e.g., routine application of established methodologies and description of established equipment) is discouraged.