Chapter 1: Vectors and Matrices in Data Mining and Pattern Recognition

Matrix Methods in Data Mining and Pattern Recognition, second edition. Published 2019-01-01. DOI: 10.1137/1.9781611975864.ch1 (https://doi.org/10.1137/1.9781611975864.ch1)

Abstract

In modern society, huge amounts of data are collected and stored in computers with the purpose of later extracting useful information. Often it is not known at the time of collection what information will be requested, so the database is often not designed to distill any particular information and is instead largely unstructured. The science of extracting useful information from large data sets is usually referred to as “data mining”, sometimes with the addition “knowledge discovery”. Pattern recognition is often considered a technique separate from data mining, but its definition is related: “the act of taking in raw data and making an action based on the ‘category’ of the pattern” [31]. In this book we will not emphasize the differences between the concepts. There are numerous application areas of data mining, ranging from e-business [10, 69] to bioinformatics [6], and from scientific applications such as the classification of volcanoes on Venus [21] to information retrieval [3] and Internet search engines [11]. Data mining is a truly interdisciplinary science, where techniques from computer science, statistics and data analysis, linear algebra, and optimization are used, often in a rather eclectic manner. Due to the practical importance of the applications, there are now numerous books and surveys in the area. We cite a few here: [24, 25, 31, 35, 45, 46, 47, 49, 108]. It is not an exaggeration to state that everyday life is filled with situations in which we depend, often unknowingly, on advanced mathematical methods for data mining. Linear algebra and data analysis are basic ingredients in many data mining techniques. This book gives an introduction to the mathematical and numerical methods and their use in data mining and pattern recognition.