Model guided algorithm for mining unordered embedded subtrees

F. Hadzic, Henry Tan, T. Dillon
Journal: Web Intell. Agent Syst.
Published: 2010-12-01
DOI: 10.3233/WIA-2010-0200 (https://doi.org/10.3233/WIA-2010-0200)
Citation count: 7

Abstract

A large amount of online information is, or can be, represented using semi-structured documents such as XML. The information contained in an XML document can be effectively represented as a rooted ordered labeled tree. As a result, the frequent pattern mining problem has been recast as the frequent subtree mining problem, which is a prerequisite for association rule mining from tree-structured documents. Driven by different application needs, a number of algorithms have been developed for mining different subtree types under different support definitions. In this paper we present an algorithm for mining unordered embedded subtrees. It is an extension of our general tree model guided (TMG) candidate generation framework, and the proposed U3 algorithm considers all support definitions, namely transaction-based, occurrence-match, and hybrid support. A number of experiments are presented on synthetic and real-world data sets. The results demonstrate the flexibility of our general TMG framework as well as its efficiency when compared to the existing state-of-the-art approach.
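
The three support definitions named in the abstract can be illustrated with a minimal Python sketch. It is not the U3 algorithm and does no subtree matching: it assumes the number of occurrences of one candidate subtree in each transaction (tree) is already known. The function names, the sample counts, and the reading of hybrid support as "at least y occurrences in at least x transactions" are assumptions made for illustration, not details taken from the abstract.

```python
from typing import Sequence


def transaction_support(counts: Sequence[int]) -> int:
    """Transaction-based support: transactions containing the subtree at least once."""
    return sum(1 for c in counts if c > 0)


def occurrence_match_support(counts: Sequence[int]) -> int:
    """Occurrence-match support: total occurrences over all transactions."""
    return sum(counts)


def hybrid_support_holds(counts: Sequence[int], x: int, y: int) -> bool:
    """Hybrid support (assumed reading): at least y occurrences in at least x transactions."""
    return sum(1 for c in counts if c >= y) >= x


# Hypothetical occurrence counts of one candidate subtree in four transactions.
counts = [2, 0, 1, 3]

print(transaction_support(counts))          # 3
print(occurrence_match_support(counts))     # 6
print(hybrid_support_holds(counts, 2, 2))   # True
print(hybrid_support_holds(counts, 3, 2))   # False
```

The same candidate can therefore be frequent under one definition and infrequent under another, which is why an algorithm that handles all three has to keep per-transaction occurrence information rather than a single counter.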