An introduction to statistical learning with applications in R

Impact factor: 1.3 · JCR quartile: Q3 · Category: Statistics & Probability

Statistical Theory and Related Fields · Pub Date: 2021-09-26 · DOI: 10.1080/24754269.2021.1980261

Fariha Sohil, Muhammad Umair Sohali, J. Shabbir

Statistical Theory and Related Fields, volume 6, issue 1, p. 87 (journal article)

Citations: 2608

Abstract

The fundamental mathematical tools needed to understand machine learning include linear algebra, analytic geometry, matrix decompositions, vector calculus, optimization, probability and statistics. These topics are traditionally taught in disparate courses, making it hard for data science or computer science students, or professionals, to efficiently learn the mathematics. This self-contained textbook bridges the gap between mathematical and machine learning texts, introducing the mathematical concepts with a minimum of prerequisites. It uses these concepts to derive four central machine learning methods: linear regression, principal component analysis, Gaussian mixture models and support vector machines. For students and others with a mathematical background, these derivations provide a starting point to machine learning texts. For those learning the mathematics for the first time, the methods help build intuition and practical experience with applying mathematical concepts. Every chapter includes worked examples and exercises to test understanding. Programming tutorials are offered on the book's web site. This textbook considers statistical learning applications when interest centers on the conditional distribution of a response variable, given a set of predictors, and in the absence of a credible model that can be specified before the data analysis begins. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis depends in an integrated fashion on sound data collection, intelligent data management, appropriate statistical procedures, and an


Source journal

Statistical Theory and Related Fields Mathematics-Analysis

CiteScore

0.90

Self-citation rate

20.00%

Articles published