Mining Contextual Item Similarity without Concept Hierarchy
Mohammad Fahim Arefin, Chowdhury Farhan Ahmed, Redwan Ahmed Rizvee, C. Leung, Longbing Cao
2022 16th International Conference on Ubiquitous Information Management and Communication (IMCOM)
Published: 2022-01-03
DOI: https://doi.org/10.1109/IMCOM53663.2022.9721788
Citation count: 2
Abstract
Data is a valuable resource in the modern era: huge amounts are generated every moment, and data mining extracts insight from them. Item similarity mining is a subfield of data mining that helps discover inherent and important characteristics of a dataset. It is a popular research problem with applications in numerous domains. In this work, we propose a novel, symmetric, null-invariant similarity measure that evaluates contextual similarity between items without any additional metadata. We also propose an optimal algorithm for calculating this measure. Because the optimal algorithm has comparatively high runtime complexity, we further propose a heuristic algorithm that produces an approximate result without sacrificing much accuracy. The measure can be used for mining localized associations and discovering object relationships in large datasets. Results obtained with the proposed measure on six real-life datasets confirm its effectiveness and versatility on data of varying nature.
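
The abstract does not define the proposed measure, so the following is only an illustration of the two properties it names, symmetry and null-invariance. As a stand-in it uses the Kulczynski measure, a well-known null-invariant measure that is not the measure proposed in this paper, and contrasts it with lift, which is not null-invariant. The function names and the toy transactions are assumptions made for this sketch.

```python
from typing import List, Set

Transaction = Set[str]


def support(transactions: List[Transaction], items: Set[str]) -> int:
    """Count the transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t)


def kulczynski(transactions: List[Transaction], a: str, b: str) -> float:
    """Stand-in measure (NOT the paper's): mean of P(a|b) and P(b|a)."""
    sup_a = support(transactions, {a})
    sup_b = support(transactions, {b})
    sup_ab = support(transactions, {a, b})
    if sup_a == 0 or sup_b == 0:
        return 0.0
    return 0.5 * (sup_ab / sup_a + sup_ab / sup_b)


def lift(transactions: List[Transaction], a: str, b: str) -> float:
    """Lift depends on the total transaction count, so it is not null-invariant."""
    n = len(transactions)
    sup_a = support(transactions, {a})
    sup_b = support(transactions, {b})
    sup_ab = support(transactions, {a, b})
    if sup_a == 0 or sup_b == 0:
        return 0.0
    return (sup_ab * n) / (sup_a * sup_b)


# Hypothetical toy data: four transactions over three items.
base: List[Transaction] = [
    {"milk", "bread"},
    {"milk", "bread"},
    {"milk"},
    {"bread", "eggs"},
]

# Add 100 "null" transactions that contain neither milk nor bread.
padded = base + [{"eggs"} for _ in range(100)]

for name, data in (("base", base), ("padded", padded)):
    print(name, kulczynski(data, "milk", "bread"), lift(data, "milk", "bread"))
```

In this sketch the Kulczynski value for `milk` and `bread` is the same on both datasets and does not depend on argument order, while lift changes once the null transactions are added. A null-invariant measure is useful for large, sparse datasets precisely because most transactions contain neither of the two items being compared.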