Incremental personalized Web page mining utilizing self-organizing HCMAC neural network

Chih-Ming Chen
Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003), published 2003-10-13
DOI: 10.1109/WI.2003.1241172 (https://doi.org/10.1109/WI.2003.1241172)
Citation count: 23

Abstract

In recent years, information has grown rapidly, especially on the World Wide Web. The volume of information returned by search engines also tends to be large, and the retrieved documents are not tailored to a user's actual needs and interests. Offering personalized services that include only the information a user is interested in has therefore become increasingly important. Web mining techniques have proven to be very useful tools for mining information of interest on the Web. However, previous studies have indicated that the main challenges in Web mining are handling high-dimensional data, achieving incremental learning (or incremental mining), and providing scalable mining as well as parallel and distributed mining algorithms. We present a novel self-organizing hierarchical CMAC (HCMAC) neural network composed of two-dimensional weighted grey CMACs (WGCMACs) that can both handle higher-dimensional classification problems and self-organize its memory structure according to the distribution of training patterns. Moreover, a learning algorithm that can learn incrementally from newly added data without forgetting prior knowledge is proposed to train the self-organizing HCMAC neural network. The network can be applied to incrementally learn user profiles from user feedback in order to identify personalized Web pages. A benchmark dataset of Web page ratings that contains four topics of user profiles is used to demonstrate the effectiveness of the proposed method. Experimental results show that the self-organizing HCMAC neural network has good incremental learning ability and overcomes the enormous memory requirement of the conventional CMAC when applied to higher-dimensional classification problems. Furthermore, the experiments confirm that the self-organizing HCMAC neural network has better forecasting ability in identifying Web pages of interest to users than other well-known classifiers.
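To make the memory problem mentioned in the abstract concrete, the following is a minimal sketch of a conventional CMAC (tile coding with an LMS update), not the HCMAC or WGCMAC proposed in the paper. The class name `TinyCMAC`, the helper `dense_cell_count`, and all parameter values are assumptions chosen for illustration. It shows two things: a densely allocated table grows exponentially with the number of input dimensions, and an online update only touches the cells activated by the current pattern, so a later pattern in a distant region does not overwrite earlier learning.

```python
class TinyCMAC:
    """Illustrative conventional CMAC; inputs are assumed to lie in [0, 1)."""

    def __init__(self, n_tilings=4, tiles_per_dim=8, lr=0.2):
        self.n_tilings = n_tilings
        self.tiles_per_dim = tiles_per_dim
        self.lr = lr
        self.weights = {}  # sparse memory: a cell exists only once it is trained

    def _active_cells(self, x):
        # Each tiling is shifted by a fraction of one tile width, so every
        # input activates exactly one cell per tiling.
        cells = []
        for t in range(self.n_tilings):
            offset = t / (self.n_tilings * self.tiles_per_dim)
            idx = tuple(int((xi + offset) * self.tiles_per_dim) for xi in x)
            cells.append((t, idx))
        return cells

    def predict(self, x):
        # Output is the sum of the weights of the active cells.
        return sum(self.weights.get(c, 0.0) for c in self._active_cells(x))

    def update(self, x, target):
        # LMS rule: spread the scaled error evenly over the active cells.
        cells = self._active_cells(x)
        error = target - sum(self.weights.get(c, 0.0) for c in cells)
        for c in cells:
            self.weights[c] = self.weights.get(c, 0.0) + self.lr * error / len(cells)
        return error


def dense_cell_count(n_dims, tiles_per_dim, n_tilings):
    # Cells needed if the whole table were allocated up front; the shifted
    # tilings can reach one extra tile per dimension.
    return n_tilings * (tiles_per_dim + 1) ** n_dims


if __name__ == "__main__":
    net = TinyCMAC()
    for _ in range(50):
        net.update((0.1, 0.1), 1.0)    # first pattern
    for _ in range(50):
        net.update((0.9, 0.9), -1.0)   # added later, in a distant region
    print(net.predict((0.1, 0.1)), net.predict((0.9, 0.9)))
    print(dense_cell_count(2, 8, 4), dense_cell_count(10, 8, 4))
```

The sparse dictionary here is only a convenience of the sketch; the paper's approach to the memory problem is the hierarchical, self-organizing structure described in the abstract, which this example does not implement.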