缺失值预测与异常检测的指数族张量分解

K. Hayashi, Takashi Takenouchi, T. Shibata, Yuki Kamiya, Daishi Kato, Kazuo Kunieda, Keiji Yamada, K. Ikeda
{"title":"缺失值预测与异常检测的指数族张量分解","authors":"K. Hayashi, Takashi Takenouchi, T. Shibata, Yuki Kamiya, Daishi Kato, Kazuo Kunieda, Keiji Yamada, K. Ikeda","doi":"10.1109/ICDM.2010.39","DOIUrl":null,"url":null,"abstract":"In this paper, we study probabilistic modeling of heterogeneously attributed multi-dimensional arrays. The model can manage the heterogeneity by employing an individual exponential-family distribution for each attribute of the tensor array. These entries are connected by latent variables and are shared information across the different attributes. Because a Bayesian inference for our model is intractable, we cast the EM algorithm approximated by using the Lap lace method and Gaussian process. This approximation enables us to derive a predictive distribution for missing values in a consistent manner. Simulation experiments show that our method outperforms other methods such as PARAFAC and Tucker decomposition in missing-values prediction for cross-national statistics and is also applicable to discover anomalies in heterogeneous office-logging data.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":"{\"title\":\"Exponential Family Tensor Factorization for Missing-Values Prediction and Anomaly Detection\",\"authors\":\"K. Hayashi, Takashi Takenouchi, T. Shibata, Yuki Kamiya, Daishi Kato, Kazuo Kunieda, Keiji Yamada, K. Ikeda\",\"doi\":\"10.1109/ICDM.2010.39\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we study probabilistic modeling of heterogeneously attributed multi-dimensional arrays. The model can manage the heterogeneity by employing an individual exponential-family distribution for each attribute of the tensor array. These entries are connected by latent variables and are shared information across the different attributes. Because a Bayesian inference for our model is intractable, we cast the EM algorithm approximated by using the Lap lace method and Gaussian process. This approximation enables us to derive a predictive distribution for missing values in a consistent manner. Simulation experiments show that our method outperforms other methods such as PARAFAC and Tucker decomposition in missing-values prediction for cross-national statistics and is also applicable to discover anomalies in heterogeneous office-logging data.\",\"PeriodicalId\":294061,\"journal\":{\"name\":\"2010 IEEE International Conference on Data Mining\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-12-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"21\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 IEEE International Conference on Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM.2010.39\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE International Conference on Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2010.39","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21

摘要

本文研究了异构属性多维阵列的概率建模问题。该模型可以通过对张量数组的每个属性采用单独的指数族分布来管理异质性。这些条目通过潜在变量连接,并在不同属性之间共享信息。由于该模型的贝叶斯推理难以处理,我们采用了Lap lace法和高斯过程近似的EM算法。这种近似使我们能够以一致的方式推导出缺失值的预测分布。仿真实验表明,该方法在跨国统计缺失值预测方面优于PARAFAC和Tucker分解等方法,也适用于异构办公日志数据的异常发现。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Exponential Family Tensor Factorization for Missing-Values Prediction and Anomaly Detection
In this paper, we study probabilistic modeling of heterogeneously attributed multi-dimensional arrays. The model can manage the heterogeneity by employing an individual exponential-family distribution for each attribute of the tensor array. These entries are connected by latent variables and are shared information across the different attributes. Because a Bayesian inference for our model is intractable, we cast the EM algorithm approximated by using the Lap lace method and Gaussian process. This approximation enables us to derive a predictive distribution for missing values in a consistent manner. Simulation experiments show that our method outperforms other methods such as PARAFAC and Tucker decomposition in missing-values prediction for cross-national statistics and is also applicable to discover anomalies in heterogeneous office-logging data.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信