Efficient pattern discovery for semistructured data
Zhou Feng, W. Hsu, M. Lee
17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'05), 2005-11-14
DOI: 10.1109/ICTAI.2005.63 (https://doi.org/10.1109/ICTAI.2005.63)
Discovering frequent patterns in large semistructured data repositories is one of the hardest categories of tree mining problems, since it involves the discovery of unordered embedded tree patterns. Existing work has focused primarily on the discovery of ordered, induced trees. This work proposes a divide-and-conquer algorithm called WTIMiner to discover the complete set of frequent unordered embedded subtrees. The algorithm reduces the complexity of the pattern matching and counting problems that a regular tree mining algorithm faces. Experimental results demonstrate the efficiency and scalability of WTIMiner in terms of both time and space.
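To make the term "unordered embedded subtree" concrete, the following is a minimal brute-force sketch of the containment test and of support counting. It is not WTIMiner and says nothing about how that algorithm works; it only illustrates the matching problem the abstract refers to. The names `Node`, `embeds_at`, `contains`, and `support` are hypothetical, and the sketch assumes one common reading of the definition: a pattern edge may map to any ancestor-descendant pair in the data tree, sibling order is ignored, and pattern siblings must map to nodes that are not ancestors of one another.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list["Node"] = field(default_factory=list)


def descendants(node):
    """Yield every proper descendant of node."""
    for child in node.children:
        yield child
        yield from descendants(child)


def is_ancestor(a, b):
    """True if a is a proper ancestor of b (compared by identity)."""
    return any(d is b for d in descendants(a))


def _pick_disjoint(candidates, chosen):
    """Pick one image per pattern child so that no two images are related."""
    if not candidates:
        return True
    for d in candidates[0]:
        unrelated = all(
            d is not c and not is_ancestor(d, c) and not is_ancestor(c, d)
            for c in chosen
        )
        if unrelated and _pick_disjoint(candidates[1:], chosen + [d]):
            return True
    return False


def embeds_at(pattern, target):
    """True if pattern embeds in the data tree with its root mapped to target."""
    if pattern.label != target.label:
        return False
    # Embedded: a pattern child may map to ANY descendant, not only a child.
    candidates = [
        [d for d in descendants(target) if embeds_at(child, d)]
        for child in pattern.children
    ]
    # Unordered: candidates are chosen without regard to sibling order.
    return _pick_disjoint(candidates, [])


def contains(tree, pattern):
    """True if pattern is an unordered embedded subtree of tree."""
    return any(embeds_at(pattern, t) for t in [tree, *descendants(tree)])


def support(database, pattern):
    """Number of trees in the database that contain the pattern."""
    return sum(contains(tree, pattern) for tree in database)


# Toy data, invented for illustration only.
t1 = Node("a", [Node("b", [Node("c"), Node("d")]), Node("e")])
t2 = Node("a", [Node("e"), Node("b", [Node("d")])])
t3 = Node("a", [Node("b"), Node("c")])
pattern = Node("a", [Node("e"), Node("d")])

print(support([t1, t2, t3], pattern))
```

In `t1` the pattern matches even though `d` is a grandchild of `a` (so the match is embedded, not induced) and even though `e` appears after `d` in the data (so the match is unordered). The exhaustive candidate search here is exponential in the worst case, which gives a sense of why matching and counting dominate the cost of this class of mining problems.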