Quantlyzer: An R Package for Automated Exploratory and Predictive Data Analysis

2023 Omaha, Nebraska July 9-12, 2023. Pub Date: 2023-07-09. DOI: 10.13031/aim.202300470

Fengkai Tian, Jianfeng Zhou, N. Aloysius, Edward J. Mirielli



Abstract

Machine learning (ML) and statistical algorithms are widely used in applications such as data classification, predictive regression, and feature selection. As the need for data-driven insights grows, so does the demand for exploratory and predictive data analysis to support business decision-making, academic research, and other applications. Identifying the best-performing model for a specific dataset is usually time-consuming, and the effort depends on the purpose of the analysis. Although many packages provide pre-built machine learning or statistical models, users still need time to load suitable packages or functions, optimize hyperparameters, validate the model, examine the statistical relationships among the variables, and so on. This paper presents "Quantlyzer", an R package that contains various popular algorithms from machine learning and statistics. The tool aims to make automated data analysis more convenient for users at all levels, from those with no data analytics experience to domain experts, and to improve the efficiency of data analysis. A workflow pipeline of exploratory analytics that contains various popular descriptive analysis techniques (e.g., Pearson correlation coefficient, a statistical summary of each variable, data visualization), statistical algorithms
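The descriptive steps named in the abstract (a statistical summary of each variable and pairwise Pearson correlation coefficients) can be illustrated with a minimal sketch. This is not Quantlyzer's API: the abstract gives no function names, so `summarize`, `pearson`, and `explore` are hypothetical helpers, and the sketch is written in Python with only the standard library rather than in R.

```python
import math
import statistics


def summarize(values):
    """Return basic descriptive statistics for one numeric variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cross / (spread_x * spread_y)


def explore(table):
    """Summarize every column and correlate every pair of columns.

    `table` maps a column name to a list of numbers (hypothetical input
    format chosen for this sketch).
    """
    summaries = {name: summarize(col) for name, col in table.items()}
    names = list(table)
    correlations = {
        (a, b): pearson(table[a], table[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    return summaries, correlations
```

Calling `explore({"x": [1, 2, 3, 4], "y": [2, 4, 6, 8]})` returns one summary dictionary per column and one correlation per column pair. An automated pipeline of the kind the abstract describes would presumably run steps like these without the user having to select them, although the abstract does not say how Quantlyzer arranges them.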
