Optimizing recombinant antibody fragment production: A comparison of artificial intelligence and statistical modeling

IF 3.2 | Zone 4 (Biology) | JCR Q2 (Biochemistry & Molecular Biology)
Majid Basafa, Atieh Hashemi, Aidin Behravan
Biotechnology and Applied Biochemistry, vol. 71, no. 5, pp. 1094-1104. Journal Article. Published 2024-05-19. DOI: 10.1002/bab.2600
https://onlinelibrary.wiley.com/doi/10.1002/bab.2600
Citations: 0

Abstract

Maximizing recombinant protein yield requires optimizing the production medium. Several methods are available, from the conventional "one-factor-at-a-time" approach to more recent statistical and mathematical methods such as artificial neural networks (ANN) and genetic algorithms. Each approach has its own advantages and disadvantages, and a technique with known flaws may still be used when it gives the best available results. Here, one categorical variable and four numerical parameters (post-induction time, inducer concentration, post-induction temperature, and pre-induction cell density) were optimized using the 232 experimental assays of a central composite design. The direct and indirect effects of these factors on the yield of an anti-epithelial cell adhesion molecule (EpCAM) extracellular domain antibody fragment were examined using statistical methods. The analysis of variance indicates that the response surface methodology (RSM) model is effective in predicting the amount of single-chain variable fragment (scFv) produced (p-value = 0.0001, R² = 0.905). For the ANN model, evaluation by normalized root mean square error (NRMSE) and R² shows a good fit (R² = 0.942) and accurate predictions (NRMSE = 0.145). On a subset of 30 data points randomly selected from the complete dataset, the ANN model had a higher R² than the RSM model (0.968 vs. 0.932) and a lower NRMSE (0.048 vs. 0.064), indicating stronger predictive ability. The optimal culture condition was induction of BW25113 at a cell density of 0.7 with 0.6 mM isopropyl β-D-1-thiogalactopyranoside (IPTG), followed by 32 h at 30°C, which gave a protein yield of 259.51 mg/L. Under these conditions, the yield predicted by the ANN model (259.83 mg/L) was closer to the experimental value (259.51 mg/L) than the yield predicted by the RSM model (276.13 mg/L). These results indicate that the ANN model outperforms the RSM model in prediction accuracy.
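
The two fit metrics quoted in the abstract, and the gap between predicted and measured yield at the optimum, can be illustrated with a minimal Python sketch. The function names are hypothetical, and normalizing the RMSE by the mean of the observations is an assumption: the abstract does not say which normalization (mean, range, or standard deviation) the authors used. Only the three yields at the optimum are taken from the abstract; no other data from the study are reproduced.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nrmse(observed, predicted):
    """RMSE divided by the mean observation (assumed normalization)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


def relative_error_pct(observed, predicted):
    """Absolute prediction error as a percentage of the observed value."""
    return abs(predicted - observed) / observed * 100.0


# Yields at the optimum condition as reported in the abstract (mg/L).
experimental = 259.51
ann_error = relative_error_pct(experimental, 259.83)
rsm_error = relative_error_pct(experimental, 276.13)

print(f"ANN relative error at optimum: {ann_error:.2f}%")  # 0.12%
print(f"RSM relative error at optimum: {rsm_error:.2f}%")  # 6.40%
```

With the reported values, the ANN prediction is within about 0.12% of the measured yield and the RSM prediction within about 6.40%, which is the comparison the abstract summarizes in words.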

Source journal: Biotechnology and Applied Biochemistry (Engineering & Technology: Biochemistry and Molecular Biology)
CiteScore: 6.00
Self-citation rate: 7.10%
Articles published: 117
Review time: 3 months
About the journal: Published since 1979, Biotechnology and Applied Biochemistry is dedicated to the rapid publication of high quality, significant research at the interface between life sciences and their technological exploitation. The Editors will consider papers for publication based on their novelty and impact as well as their contribution to the advancement of medical biotechnology and industrial biotechnology, covering cutting-edge research in synthetic biology, systems biology, metabolic engineering, bioengineering, biomaterials, biosensing, and nano-biotechnology.