Data mining and knowledge discovery in databases: implications for scientific databases
U. Fayyad
Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150), 1997-08-11
DOI: 10.1109/SSDM.1997.621141 (https://doi.org/10.1109/SSDM.1997.621141)
Citations: 119
Abstract
Data mining and knowledge discovery in databases (KDD) promise to play an important role in the way people interact with databases, especially scientific databases, where analysis and exploration operations are essential. The author defines the basic notions and goals of data mining and KDD, presents the motivation, and gives a high-level definition of the KDD process and how it relates to data mining. The author then focuses on data mining methods, covering a sample of them to illustrate how they work and how they are used. A case study presents a successful application in science data analysis: the classification and cataloging of a major astronomy sky survey covering 2 billion objects in the northern sky. The system can outperform humans, as well as classical computational analysis tools in astronomy, on the task of recognizing faint stars and galaxies. The author also covers the problem of scaling clustering to a large catalog database of billions of objects.