Performance Implication of Knowledge Discovery Techniques in Databases

B. Rajagopalan, R. Krovi
Advanced Topics in Database Research, Vol. 2. DOI: 10.4018/978-1-59140-063-9.CH009 (https://doi.org/10.4018/978-1-59140-063-9.CH009)

Abstract

This chapter introduces knowledge discovery techniques as a means of identifying critical trends and patterns for business decision support. It suggests that effective implementation of these techniques requires a careful assessment of the various data mining tools and algorithms available. Both statistical and machine-learning based algorithms have been widely applied to discover knowledge from data. In this chapter we describe some of these algorithms and investigate their relative performance for classification problems. Simulation-based results support the proposition that machine-learning algorithms outperform their statistical counterparts, albeit only under certain conditions. Further, the authors hope that the discussion of performance-related issues will foster a better understanding of the application and appropriateness of knowledge discovery techniques.

701 E. Chocolate Avenue, Hershey PA 17033-1240, USA. Tel: 717/533-8845; Fax 717/533-8661; URL: http://www.idea-group.com. IDEA GROUP PUBLISHING. This chapter appears in the book, Advanced Topics in Database Research, edited by Keng Sia. Copyright © 2003, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.

INTRODUCTION

The volume of data collected by businesses today is phenomenal and is increasing exponentially. The challenge is to integrate and correlate data related to both online and offline sales, customer satisfaction surveys, and server log files. To this end, data mining (DM), the process of sifting through the mass of organizational (internal and external) data to identify patterns, is critical for decision support. Effective data mining has several applications, such as fraud detection and bankruptcy prediction (Tam & Kiang, 1992; Lee, Han, & Kwon, 1996; Kumar, Krovi, & Rajagopalan, 1997), strategic decision-making (Nazem & Shin, 1999), and database marketing (Brachman, Khabaza, Kloesgen, Piatetsky-Shapiro, & Simoudis, 1996). Today, businesses have a unique opportunity to use such techniques for target marketing and customer relationship management. Analysis of the massive data collected by businesses can support intelligence-gathering efforts about their competition, product, or market. Intelligent tools based on rules derived from web mining can also play an important role in personalization related to site content and presentation. Recently, there has been considerable interest in how to integrate and mine such data (Mulvenna, Anand, & Buchner, 2000; Brachman et al., 1996).

Business databases in general pose a unique problem for pattern extraction because of their complex nature. This complexity arises from anomalies such as discontinuity, noise, ambiguity, and incompleteness (Fayyad, Piatetsky-Shapiro, & Smyth, 1996). Historically, decision makers had to manually deduce patterns using information generated by query reporting systems. One level of analytical sophistication above this was the ability to look at the data and perform analyses such as What-If and goal seeking. More recently, online analytical processing