Provable learning of noisy-OR networks

Sanjeev Arora, Rong Ge, Tengyu Ma, Andrej Risteski
{"title":"Provable learning of noisy-OR networks","authors":"Sanjeev Arora, Rong Ge, Tengyu Ma, Andrej Risteski","doi":"10.1145/3055399.3055482","DOIUrl":null,"url":null,"abstract":"Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (= coordinates of the given datapoint) are explained as a probabilistic function of some hidden variables. Learning the model ---that is, the mapping from hidden variables to visible ones and vice versa---is NP-hard even in very simple settings. In recent years, provably efficient algorithms were nevertheless developed for models with linear structure: topic models, mixture models, hidden markov models, etc. These algorithms use matrix or tensor decomposition, and make some reasonable assumptions about the parameters of the underlying model. But matrix or tensor decomposition seems of little use when the latent variable model has nonlinearities. The current paper shows how to make progress: tensor decomposition is applied for learning the single-layer noisy-OR network, which is a textbook example of a bayes net, and used for example in the classic QMR-DT software for diagnosing which disease(s) a patient may have by observing the symptoms he/she exhibits. The technical novelty here, which should be useful in other settings in future, is analysis of tensor decomposition in presence of systematic error (i.e., where the noise/error is correlated with the signal, and doesn't decrease as number of samples goes to infinity). This requires rethinking all steps of tensor decomposition methods from the ground up. For simplicity our analysis is stated assuming that the network parameters were chosen from a probability distribution but the method seems more generally applicable.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3055399.3055482","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 27

Abstract

Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (= coordinates of the given datapoint) are explained as a probabilistic function of some hidden variables. Learning the model ---that is, the mapping from hidden variables to visible ones and vice versa---is NP-hard even in very simple settings. In recent years, provably efficient algorithms were nevertheless developed for models with linear structure: topic models, mixture models, hidden markov models, etc. These algorithms use matrix or tensor decomposition, and make some reasonable assumptions about the parameters of the underlying model. But matrix or tensor decomposition seems of little use when the latent variable model has nonlinearities. The current paper shows how to make progress: tensor decomposition is applied for learning the single-layer noisy-OR network, which is a textbook example of a bayes net, and used for example in the classic QMR-DT software for diagnosing which disease(s) a patient may have by observing the symptoms he/she exhibits. The technical novelty here, which should be useful in other settings in future, is analysis of tensor decomposition in presence of systematic error (i.e., where the noise/error is correlated with the signal, and doesn't decrease as number of samples goes to infinity). This requires rethinking all steps of tensor decomposition methods from the ground up. For simplicity our analysis is stated assuming that the network parameters were chosen from a probability distribution but the method seems more generally applicable.
噪声或网络的可证明学习
许多机器学习应用程序使用潜在变量模型来解释数据中的结构,由此可见变量(=给定数据点的坐标)被解释为一些隐藏变量的概率函数。学习模型——即从隐藏变量到可见变量的映射,反之亦然——即使在非常简单的设置中也是np困难的。近年来,针对线性结构的模型,如主题模型、混合模型、隐马尔可夫模型等,开发了可证明有效的算法。这些算法使用矩阵或张量分解,并对底层模型的参数做出一些合理的假设。但是当潜在变量模型具有非线性时,矩阵或张量分解似乎没有什么用处。目前的论文展示了如何取得进展:张量分解用于学习单层噪声或网络,这是贝叶斯网络的教科书示例,例如在经典的QMR-DT软件中用于通过观察患者表现出的症状来诊断患者可能患有哪种疾病。这里的技术新颖性,应该在未来的其他设置中有用,是在存在系统误差的情况下分析张量分解(即,噪声/误差与信号相关,并且随着样本数量趋于无穷大而不减少)。这需要从头开始重新思考张量分解方法的所有步骤。为简单起见,我们的分析假设网络参数是从概率分布中选择的,但这种方法似乎更适用于一般情况。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信