Mining Sequential Patterns More Efficiently by Reducing the Cost of Scanning Sequence Databases

Jiahong Wang, Yoshiaki Asanuma, Eiichiro Kodama, T. Takata, Jie Li
IPSJ Digital Courier. Published 2006-12-15. DOI: 10.2197/IPSJDC.2.768 (https://doi.org/10.2197/IPSJDC.2.768)
Citations: 2

Abstract

Sequential pattern mining discovers frequent subsequences, treated as patterns, in a sequence database. Depending on the application, sequence databases vary in the number of sequences, the number of distinct items, the average sequence length, and the average length of potential patterns. In addition, the support threshold may be set to different values in order to discover the patterns of interest. A sequential pattern-mining algorithm should therefore remain responsive across all of these factors. For that purpose, we propose a candidate-driven pattern-growth sequential pattern-mining algorithm called FSPM (Fast Sequential Pattern Mining). A useful property of FSPM is that the sequential patterns concerning a user-specified item can be mined directly. Extensive experimental results show that, in most cases, FSPM outperforms existing algorithms. An analytical performance study attributes this advantage to properties inherent in FSPM.
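
To make the notions of support and pattern growth concrete, the following is a minimal sketch of a generic prefix-projection miner. It is not FSPM, whose details the abstract does not give. It assumes, for simplicity, that each sequence is an ordered list of single items rather than itemsets, and the names `is_subsequence`, `support`, and `mine` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match,
    # so later items must be found after earlier ones.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: Sequence[Sequence[str]]) -> int:
    """Count the sequences in `database` that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def mine(database: Sequence[Sequence[str]], min_support: int) -> Dict[Pattern, int]:
    """Generic pattern growth over projected suffixes (illustrative only, not FSPM)."""
    results: Dict[Pattern, int] = {}

    def grow(prefix: Pattern, projected: List[Tuple[str, ...]]) -> None:
        # Count, for each item, how many projected suffixes contain it.
        counts: Dict[str, int] = {}
        for suffix in projected:
            for item in set(suffix):
                counts[item] = counts.get(item, 0) + 1
        for item, count in sorted(counts.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project each suffix past the first occurrence of `item`.
            next_projected = [s[s.index(item) + 1:] for s in projected if item in s]
            grow(pattern, next_projected)

    grow((), [tuple(seq) for seq in database])
    return results


if __name__ == "__main__":
    db = [("a", "b", "c"), ("a", "c"), ("b", "c"), ("a", "b")]
    for pattern, count in mine(db, min_support=2).items():
        print(pattern, count)
```

Lowering `min_support` enlarges the result set and the number of projections performed, which is one reason the support threshold appears among the factors listed above.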