Data-efficient AI models for aerodynamic coefficient prediction: a case study on NACA airfoils with minimal CFD runs

Impact factor: 2.6, JCR Q2 (Multidisciplinary Sciences)

Beni-Suef University Journal of Basic and Applied Sciences, Pub Date: 2026-03-27, DOI: 10.1186/s43088-026-00750-1

Abdelrahman Mostafa, Mohamed Adel, Mira M. Suliman, Mofreh Milad, Amr A. Zamel

Citation count: 0

Abstract

Background

High-fidelity computational fluid dynamics (CFD) remains a bottleneck during early aerodynamic design, where many candidate configurations must be screened under strict computational budgets. Data-efficient artificial intelligence (AI) surrogates offer a promising solution to recover aerodynamic coefficients with minimal simulation effort. Within this context, achieving comparable predictive performance using a fraction of the CFD data is essential for efficient surrogate-based design exploration. To reflect real aerodynamic design constraints, this study explicitly investigates how far CFD-generated datasets can be reduced while maintaining predictive reliability, establishing a quantitative balance between accuracy and computational cost.

Methods

A validated dataset was generated using Reynolds-averaged Navier–Stokes (RANS) simulations with the Spalart–Allmaras turbulence model for four NACA airfoil families, covering a range of Reynolds numbers and angles of attack. To emulate resource-constrained scenarios, the dataset was systematically reduced to 30%, 25%, 20%, 15%, 10%, and 5% of its original size. Ten widely used machine learning algorithms were benchmarked against feedforward backpropagation neural networks with varying hidden layer sizes.
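The reduction protocol described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' pipeline: the data are synthetic (lift generated from the thin-airfoil relation C_l = 2πα plus Gaussian noise, standing in for RANS output), the surrogate is an ordinary least-squares line rather than any of the benchmarked algorithms, and the names `make_synthetic_dataset`, `fit_line`, and `evaluate_fraction` are hypothetical. Only the list of training fractions comes from the abstract.

```python
import math
import random

# Training fractions taken from the abstract.
FRACTIONS = [0.30, 0.25, 0.20, 0.15, 0.10, 0.05]


def make_synthetic_dataset(n=400, seed=0):
    """Synthetic stand-in for CFD data: angle of attack (deg) -> lift coefficient."""
    rng = random.Random(seed)
    alphas = [rng.uniform(-5.0, 10.0) for _ in range(n)]
    # Thin-airfoil theory slope (2*pi per radian) plus small noise; NOT real CFD output.
    cls = [2 * math.pi * math.radians(a) + rng.gauss(0.0, 0.02) for a in alphas]
    return alphas, cls


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluate_fraction(alphas, cls, fraction, seed=0):
    """Train on a random `fraction` of the samples, score R^2 on the rest."""
    rng = random.Random(seed)
    idx = list(range(len(alphas)))
    rng.shuffle(idx)
    n_train = max(2, round(fraction * len(idx)))
    train, test = idx[:n_train], idx[n_train:]
    slope, intercept = fit_line([alphas[i] for i in train], [cls[i] for i in train])
    preds = [slope * alphas[i] + intercept for i in test]
    return n_train, r2_score([cls[i] for i in test], preds)


alphas, cls = make_synthetic_dataset()
results = {f: evaluate_fraction(alphas, cls, f) for f in FRACTIONS}
for f, (n_train, r2) in results.items():
    print(f"train fraction {f:.0%}: {n_train} samples, held-out R^2 = {r2:.4f}")
```

Scoring on the samples left out of training is one reasonable choice for this kind of study; the abstract does not state how the authors split or evaluated their data.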

Results

Across the tested configurations, models maintained strong predictive performance even with substantially reduced datasets. At a 10% training ratio, some models achieved R² values up to 0.98398 with low mean absolute error, while simulation time decreased from approximately 36 h 40 min to about 3 h 45 min. However, reducing the dataset to 5% resulted in a measurable decline in accuracy, particularly for lift predictions and for deeper neural networks.
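As a quick check on the reported timings, the arithmetic below converts both durations to minutes and computes the resulting saving. The two durations are the approximate values quoted in the abstract; the helper name `to_minutes` is hypothetical.

```python
def to_minutes(hours, minutes):
    """Convert an h/min duration to total minutes."""
    return hours * 60 + minutes


full_run = to_minutes(36, 40)    # approximate simulation time for the full dataset
reduced_run = to_minutes(3, 45)  # approximate simulation time at the 10% training ratio

time_ratio = reduced_run / full_run      # share of the original cost still required
time_saved = 1.0 - time_ratio            # share of the original cost avoided
speedup = full_run / reduced_run         # how many times faster the reduced campaign is

print(f"full: {full_run} min, reduced: {reduced_run} min")
print(f"remaining cost: {time_ratio:.1%}, saved: {time_saved:.1%}, speedup: {speedup:.2f}x")
```

The remaining cost works out to roughly one tenth of the original, which is consistent with simulation time scaling approximately linearly with the number of CFD runs retained.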

Conclusions

This study demonstrates that AI surrogates can be integrated into CFD workflows to significantly reduce computational cost while preserving predictive accuracy. The findings establish a practical framework for dataset reduction, paving the way for extending this methodology to broader aerodynamic configurations in future studies.

Source journal

Beni-Suef University Journal of Basic and Applied Sciences (Multidisciplinary Sciences)

CiteScore: 2.60

Self-citation rate: 0.00%

Journal description: Beni-Suef University Journal of Basic and Applied Sciences (BJBAS) is a peer-reviewed, open-access journal. It welcomes original research, literature reviews, and editorials in its respective fields of fundamental science, applied science (with a particular focus on applied nanotechnology and biotechnology), medical sciences, pharmaceutical sciences, and engineering. The journal's multidisciplinary scope encourages global collaboration between researchers in multiple fields and provides cross-disciplinary dissemination of findings.