Chapter 1: Vectors and Matrices in Data Mining and Pattern Recognition

Matrix Methods in Data Mining and Pattern Recognition, second edition. Pub Date: 2019-01-01. DOI: 10.1137/1.9781611975864.ch1

Abstract

In modern society, huge amounts of data are collected and stored in computers so that useful information can be extracted later. At the time of collection it is often not known what information will be requested, so the database is usually not designed to distill any particular information and is instead largely unstructured. The science of extracting useful information from large data sets is usually referred to as “data mining”, sometimes with the addition of “knowledge discovery”. Pattern recognition is often considered a technique separate from data mining, but its definition is related: “the act of taking in raw data and making an action based on the ‘category’ of the pattern” [31]. In this book we will not emphasize the differences between the concepts. Data mining has numerous application areas, ranging from e-business [10, 69] to bioinformatics [6], and from scientific applications such as the classification of volcanoes on Venus [21] to information retrieval [3] and Internet search engines [11]. Data mining is a truly interdisciplinary science in which techniques from computer science, statistics and data analysis, linear algebra, and optimization are used, often in a rather eclectic manner. Because of the practical importance of the applications, there are now numerous books and surveys in the area; we cite a few here: [24, 25, 31, 35, 45, 46, 47, 49, 108]. It is no exaggeration to state that everyday life is filled with situations in which we depend, often unknowingly, on advanced mathematical methods for data mining. Linear algebra and data analysis are basic ingredients in many data mining techniques. This book gives an introduction to the mathematical and numerical methods and their use in data mining and pattern recognition.
