HSNP-Miner: High Utility Self-Adaptive Nonoverlapping Pattern Mining

Motaher Hossain, Youxi Wu, Philippe Fournier-Viger, Zhao Li, Lei Guo, Yan Li
Published in: 2021 IEEE International Conference on Big Knowledge (ICBK), 2021-12-01
DOI: 10.1109/ICKG52313.2021.00019 (https://doi.org/10.1109/ICKG52313.2021.00019)
Citations: 1

Abstract

Sequential pattern mining (SPM) under the nonoverlapping condition (nonoverlapping SPM) is a type of data mining that extracts frequent gapped subsequences (known as patterns) from sequences, and it is more valuable and versatile than other related methods. In nonoverlapping SPM, two occurrences cannot use the same sequence letter at the same position within the occurrences. This method evaluates only the frequency of the patterns in the sequence and ignores the impact of external utility (item price or profit). As a result, some low-frequency but important patterns are overlooked. To address this issue, this paper introduces High Utility Self-adaptive Nonoverlapping Pattern (HSNP) mining and proposes HSNP-Miner, which includes two steps: support calculation and candidate pattern generation. To calculate the support, we propose the NoSup algorithm, which calculates support effectively while avoiding the creation of redundant nodes. An advanced upper bound method is employed to generate the candidate patterns more efficiently. Compared with other competitive methods, the experimental results demonstrate the efficiency of the proposed algorithm and the uniqueness of nonoverlapping sequence patterns.
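The nonoverlapping condition and the role of external utility can be illustrated with a small sketch. The code below is not the NoSup algorithm or HSNP-Miner from the paper. It is a simple greedy left-to-right scan, and it assumes that "self-adaptive" means gaps of any length are allowed between pattern letters. The function names `nonoverlapping_support` and `illustrative_utility`, the toy sequence, and the utility definition (support multiplied by the summed per-item utilities) are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def nonoverlapping_support(sequence: str, pattern: str) -> int:
    """Greedily count occurrences of `pattern` in `sequence` with arbitrary gaps.

    Nonoverlapping condition: a sequence position may not be matched to the
    same pattern position by two different occurrences, but it may be reused
    at a different pattern position.
    """
    if not pattern:
        return 0
    # last_used[j] = last sequence index already matched to pattern[j]
    last_used: List[int] = [-1] * len(pattern)
    count = 0
    while True:
        prev = -1  # index matched to the previous pattern position
        for j, symbol in enumerate(pattern):
            # Must lie after the previous letter of this occurrence and
            # after every index already consumed at pattern position j.
            start = max(last_used[j], prev) + 1
            idx = sequence.find(symbol, start)
            if idx == -1:
                return count  # the greedy scan cannot complete another occurrence
            last_used[j] = idx
            prev = idx
        count += 1


def illustrative_utility(sequence: str, pattern: str,
                         unit_utility: Dict[str, float]) -> float:
    """HYPOTHETICAL utility: support times the summed per-item utilities.

    This definition is an assumption for illustration, not the paper's.
    """
    per_occurrence = sum(unit_utility.get(ch, 0.0) for ch in pattern)
    return nonoverlapping_support(sequence, pattern) * per_occurrence


if __name__ == "__main__":
    seq = "abcabcab"
    prices = {"a": 1.0, "b": 1.0, "c": 5.0}  # made-up external utilities
    for pat in ("ab", "ac"):
        print(pat, nonoverlapping_support(seq, pat),
              illustrative_utility(seq, pat, prices))
```

In this toy sequence the pattern "ab" has support 3 and "ac" has support 2, yet under the assumed utilities "ac" scores 12.0 against 6.0 for "ab". This is the kind of lower-frequency but higher-value pattern that a purely frequency-based miner would rank lower.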