利用h散度对概率自动机进行剪枝

2011 IEEE 23rd International Conference on Tools with Artificial Intelligence Pub Date : 2011-11-07 DOI:10.1109/ICTAI.2011.114

Marc Bernard, Baptiste Jeudy, Jean-Philippe Peyrache, M. Sebban, F. Thollard

{"title":"利用h散度对概率自动机进行剪枝","authors":"Marc Bernard, Baptiste Jeudy, Jean-Philippe Peyrache, M. Sebban, F. Thollard","doi":"10.1109/ICTAI.2011.114","DOIUrl":null,"url":null,"abstract":"A problem usually encountered in probabilistic automata learning is the difficulty to deal with large training samples and/or wide alphabets. This is partially due to the size of the resulting Probabilistic Prefix Tree (PPT) from which state merging-based learning algorithms are generally applied. In this paper, we propose a novel method to prune PPTs by making use of the H-divergence d_H, recently introduced in the field of domain adaptation. d_H is based on the classification error made by an hypothesis learned from unlabeled examples drawn according to two distributions to compare. Through a thorough comparison with state-of-the-art divergence measures, we provide experimental evidences that demonstrate the efficiency of our method based on this simple and intuitive criterion.","PeriodicalId":332661,"journal":{"name":"2011 IEEE 23rd International Conference on Tools with Artificial Intelligence","volume":"114 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Using the H-Divergence to Prune Probabilistic Automata\",\"authors\":\"Marc Bernard, Baptiste Jeudy, Jean-Philippe Peyrache, M. Sebban, F. Thollard\",\"doi\":\"10.1109/ICTAI.2011.114\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A problem usually encountered in probabilistic automata learning is the difficulty to deal with large training samples and/or wide alphabets. This is partially due to the size of the resulting Probabilistic Prefix Tree (PPT) from which state merging-based learning algorithms are generally applied. In this paper, we propose a novel method to prune PPTs by making use of the H-divergence d_H, recently introduced in the field of domain adaptation. d_H is based on the classification error made by an hypothesis learned from unlabeled examples drawn according to two distributions to compare. Through a thorough comparison with state-of-the-art divergence measures, we provide experimental evidences that demonstrate the efficiency of our method based on this simple and intuitive criterion.\",\"PeriodicalId\":332661,\"journal\":{\"name\":\"2011 IEEE 23rd International Conference on Tools with Artificial Intelligence\",\"volume\":\"114 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE 23rd International Conference on Tools with Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICTAI.2011.114\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 23rd International Conference on Tools with Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2011.114","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

在概率自动机学习中经常遇到的一个问题是难以处理大型训练样本和/或广泛的字母。这部分是由于结果的概率前缀树(PPT)的大小，通常应用基于状态合并的学习算法。本文提出了一种利用域自适应领域新近引入的h -散度d_H对PPTs进行剪枝的新方法。d_H是基于从根据两个分布进行比较的未标记示例中学习到的假设所产生的分类误差。通过与最先进的散度测量方法的全面比较，我们提供了实验证据，证明了基于该简单直观准则的方法的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Using the H-Divergence to Prune Probabilistic Automata

A problem usually encountered in probabilistic automata learning is the difficulty to deal with large training samples and/or wide alphabets. This is partially due to the size of the resulting Probabilistic Prefix Tree (PPT) from which state merging-based learning algorithms are generally applied. In this paper, we propose a novel method to prune PPTs by making use of the H-divergence d_H, recently introduced in the field of domain adaptation. d_H is based on the classification error made by an hypothesis learned from unlabeled examples drawn according to two distributions to compare. Through a thorough comparison with state-of-the-art divergence measures, we provide experimental evidences that demonstrate the efficiency of our method based on this simple and intuitive criterion.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 IEEE 23rd International Conference on Tools with Artificial Intelligence

自引率

0.00%

发文量