Weighted Instance Based Learner (WIBL) for user profiling
A. Cufoglu, M. Lohi, C. Everiss
2012 IEEE 10th International Symposium on Applied Machine Intelligence and Informatics (SAMI), published 2012-06-01
DOI: 10.1109/SAMI.2012.6208957 (https://doi.org/10.1109/SAMI.2012.6208957)
Citations: 5
Abstract
With the growth of web-based products and services, user profiling has created opportunities for businesses and other organizations to provide a channel for user awareness and to achieve high user satisfaction. Apart from traditional collaborative and content-based methods, a number of classification and clustering algorithms have been used for user profiling. The Instance Based Learner (IBL) classifier is a comprehensive form of the Nearest Neighbour (NN) algorithm, and it is suitable for user profiling because users with similar profiles are likely to share similar personal interests and preferences. In IBL, every attribute has an equal effect on the classification regardless of its relevance. In this paper, we propose a weighted classification method, the Weighted Instance Based Learner (WIBL), to build and handle user profiles. With WIBL, we introduce the Per Category Feature (PCF) method to IBL in order to distinguish the effects of individual attributes on classification. PCF is an attribute weighting method that assigns weights to attributes using conditional probabilities. PCF cannot be used directly with IBL, so we also propose two possible solutions to this problem. This study aims to test the performance of WIBL for user profiling. To validate its performance, a series of computer simulations was carried out on a large user profile database that includes 5000 training and 1000 test instances (users). Each user is represented by three sets of profile information: demographic, interest and preference data. The results show that WIBL with the PCF methods performs better than IBL on user profiling, reducing the error by up to 28% on the selected dataset.
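The abstract does not give the PCF formula or the two workarounds that make it usable with IBL, so the following is only a minimal sketch of the general idea: a nearest-neighbour classifier over categorical profile attributes in which each attribute's contribution to the distance is scaled by a weight derived from conditional probabilities. The weighting rule in `attribute_weights`, the function names and the toy profiles are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


def attribute_weights(rows, labels):
    """Weight each attribute by how well its values predict the class.

    Illustrative stand-in, NOT the paper's PCF formula. For attribute a:
        w_a = sum over values v of P(a = v) * max_c P(c | a = v)
    which simplifies to (sum over v of the largest class count for v) / n.
    """
    n = len(rows)
    weights = []
    for a in range(len(rows[0])):
        by_value = defaultdict(Counter)  # attribute value -> class counts
        for row, label in zip(rows, labels):
            by_value[row[a]][label] += 1
        weights.append(sum(max(c.values()) for c in by_value.values()) / n)
    return weights


def weighted_distance(x, y, weights):
    """Weighted overlap distance: add an attribute's weight where values differ."""
    return sum(w for xi, yi, w in zip(x, y, weights) if xi != yi)


def classify(query, rows, labels, weights):
    """1-nearest-neighbour prediction; ties go to the earliest training row."""
    nearest = min(
        range(len(rows)),
        key=lambda i: weighted_distance(query, rows[i], weights),
    )
    return labels[nearest]


# Hypothetical toy profiles: (age group, city, favourite genre) -> class.
# Only the genre separates the two classes; age and city are uninformative.
rows = [
    ("old", "london", "music"),
    ("young", "paris", "music"),
    ("young", "london", "sports"),
    ("old", "paris", "sports"),
]
labels = ["B", "B", "A", "A"]

weights = attribute_weights(rows, labels)
query = ("old", "london", "sports")

print(weights)                                          # [0.5, 0.5, 1.0]
print(classify(query, rows, labels, weights))           # A
print(classify(query, rows, labels, [1.0, 1.0, 1.0]))   # B
```

With equal weights the query is one mismatch away from three training rows, and the tie falls to the first of them, a class "B" profile that matches only on the uninformative attributes. With the learned weights a mismatch on genre costs more than a mismatch on age or city, so the nearest neighbours are the two class "A" profiles.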