Felix C. Oettl, Jacob F. Oeding, Robert Feldt, Christophe Ley, Michael T. Hirschmann, Kristian Samuelsson, ESSKA Artificial Intelligence Working Group
DOI: 10.1002/ksa.12389
Published: 2024-07-31
Full text: https://onlinelibrary.wiley.com/doi/10.1002/ksa.12389
Level of Evidence: Level V - Expert opinion.
The artificial intelligence advantage: Supercharging exploratory data analysis
Exploratory data analysis (EDA) is a critical step in scientific projects, aiming to uncover valuable insights and patterns within data. Traditionally, EDA involves manual inspection, visualization, and various statistical methods. The advent of artificial intelligence (AI) and machine learning (ML) has the potential to improve EDA by offering more sophisticated approaches that enhance its efficacy. This review explores how AI and ML algorithms can improve feature engineering and selection during EDA, leading to more robust predictive models and data-driven decisions. Tree-based models, regularized regression, and clustering algorithms were identified as key techniques. These methods automate feature importance ranking, handle complex interactions, perform feature selection, reveal hidden groupings, and detect anomalies. Real-world applications include risk prediction in total hip arthroplasty and subgroup identification in scoliosis patients. Recent advances in explainable AI and EDA automation show potential for further improvement. Integrating AI and ML into EDA accelerates tasks and uncovers sophisticated insights. However, using these methods effectively requires a deep understanding of the algorithms, their assumptions, and their limitations, along with domain knowledge for proper interpretation. As data volumes continue to grow, AI will play an increasingly pivotal role in EDA when combined with human expertise, driving more informed, data-driven decision-making across scientific domains.
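To make the feature-selection role of regularized regression concrete, the following is a minimal sketch, not taken from the paper. It assumes L1 (lasso) regularization fitted by cyclic coordinate descent, and it uses a tiny synthetic dataset with hypothetical feature names (`feature_a`, `feature_b`, `feature_c`) and an arbitrarily chosen penalty `alpha=0.5`. The function names are illustrative, not the authors' implementation or any library's API.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values inside the band become exactly 0.0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    Assumes the columns of X and the target y are already centred (no intercept).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residual: what remains of y after removing every feature except j.
            partial = [
                y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j)
                for i in range(n)
            ]
            rho = sum(X[i][j] * partial[i] for i in range(n)) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / scale
    return w


# Synthetic, centred toy data (an assumption for illustration only):
# the target depends strongly on the first two features and only weakly on the third.
names = ["feature_a", "feature_b", "feature_c"]
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [3 * a - 2 * b + 0.1 * c for a, b, c in X]

weights = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [name for name, w in zip(names, weights) if w != 0.0]
print(selected)
```

On this toy data the penalty shrinks the two strong coefficients and sets the weak one exactly to zero, so only `feature_a` and `feature_b` are kept. In real EDA the penalty would normally be tuned, for example by cross-validation, and the selected features checked against domain knowledge.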