Improving the Performance of Collaborative Filtering Using Outlier Labeling, Clustering, and Association Rule Mining

2018 5th International Conference on Data and Software Engineering (ICoDSE). Pub Date: 2018-11-01. DOI: 10.1109/ICODSE.2018.8705883

Rachmadian Trihatmaja, Yudistira Dwi Wardhana Asnar


Citations: 1

Abstract

Collaborative Filtering (CF) is a popular recommendation method because it can provide personalized recommendations. Under data sparsity, CF recommendation systems are known to have low accuracy because the available historical information is not enough to properly identify preferences. Based on experiments conducted by researchers, the factor that limits the accuracy of CF recommendation systems is that the number of recommended items exceeds the system requirements. The existence of outliers with too many items also affects the recommendation results. This study attempts to apply model-based CF to improve the accuracy of CF recommendations under data sparsity. Outlier labeling, clustering, and association rule mining are applied, starting from preprocessing, as a combination of data processing methods to generate recommendation items. Experiments were conducted on the Groceries dataset, real-world point-of-sale transaction data from a grocery outlet. The evaluation results indicate that the proposed method can improve the accuracy of basic CF by 26% and the quality of the recommendation results by 24%.
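As a rough illustration of how outlier labeling and association rule mining could feed a recommender, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the labeling rule, thresholds, or clustering algorithm, so the IQR-based fence (`k=1.5`), the `min_support` and `min_confidence` values, the function names, and the toy baskets are all assumptions made for this example. The clustering step is omitted for brevity, and only single-item rules of the form {a} -> {b} are mined.

```python
from collections import Counter
from itertools import combinations
from statistics import quantiles


def label_outliers(baskets, k=1.5):
    """Return one boolean per basket: True if its size exceeds Q3 + k * IQR.

    The IQR fence and k=1.5 are assumptions; the abstract does not name a rule.
    """
    sizes = [len(b) for b in baskets]
    q1, _, q3 = quantiles(sizes, n=4)  # quartiles of basket sizes
    upper = q3 + k * (q3 - q1)
    return [size > upper for size in sizes]


def mine_pair_rules(baskets, min_support=0.3, min_confidence=0.6):
    """Mine rules {a} -> {b} from item pairs as (a, b, support, confidence)."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


def recommend(basket, rules, top_n=3):
    """Recommend rule consequents that are not already in the basket."""
    scores = {}
    for lhs, rhs, _, confidence in rules:
        if lhs in basket and rhs not in basket:
            scores[rhs] = max(scores.get(rhs, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Toy data, made up for illustration (not the Groceries dataset).
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk", "butter"},
    {"bread", "butter"},
    {"milk", "bread"},
    {"milk", "bread", "butter", "eggs", "jam", "tea", "rice", "soap", "salt", "oil"},
]

flags = label_outliers(baskets)
kept = [b for b, is_outlier in zip(baskets, flags) if not is_outlier]
rules = mine_pair_rules(kept)
print(recommend({"milk"}, rules))
```

In this sketch the unusually large basket is labeled as an outlier and excluded before rule mining, so its many one-off items do not produce rules. A fuller pipeline along the lines the abstract describes would presumably cluster the remaining transactions and mine rules per cluster.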
