Algorithms and Applications to Weighted Rank-one Binary Matrix Factorization

Impact factor 2.5, JCR Q2 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Haibing Lu, X I Chen, Junmin Shi, Jaideep Vaidya, Vijayalakshmi Atluri, Yuan Hong, Wei Huang
ACM Transactions on Management Information Systems, vol. 11, no. 2. Published 2020-05-01. DOI: 10.1145/3386599 (https://doi.org/10.1145/3386599)
Citations: 8

Algorithms and Applications to Weighted Rank-one Binary Matrix Factorization.

Many applications use data that are best represented as a binary matrix, such as click-stream data, market basket data, document-term data, and user-permission data in access control. Matrix factorization methods are widely used for the analysis of high-dimensional data, as they automatically extract sparse and meaningful features from data vectors. However, existing matrix factorization methods do not work well for binary data. One crucial limitation is interpretability: many matrix factorization methods decompose an input matrix into matrices with fractional or even negative components, which are hard to interpret in many real settings. Some matrix factorization methods, like binary matrix factorization, do limit the decomposed matrices to binary values. However, these models are not flexible enough to accommodate some data analysis tasks, like trading off summary size against quality and discriminating between different types of approximation errors. To address those issues, this article presents weighted rank-one binary matrix factorization, which approximates a binary matrix by the product of two binary vectors, with parameters controlling different types of approximation errors. By systematically running weighted rank-one binary matrix factorization, one can effectively perform various binary data analysis tasks, like compression, clustering, and pattern discovery. Theoretical properties of weighted rank-one binary matrix factorization are investigated, and its connections to problems in other research domains are examined. As weighted rank-one binary matrix factorization is NP-hard in general, efficient and effective algorithms are presented. Extensive studies on applications of weighted rank-one binary matrix factorization are also conducted.
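
To make the idea concrete, here is a minimal sketch of what a weighted rank-one binary approximation could look like. The abstract does not give the exact objective or the algorithms, so everything below is an assumption for illustration: the two error types are taken to be uncovered 1s (false negatives) and covered 0s (false positives), the weights `w_fn` and `w_fp` are hypothetical parameter names, and the alternating update is a simple local-search heuristic, not the method presented in the article.

```python
import numpy as np


def weighted_error(A, x, y, w_fn=1.0, w_fp=1.0):
    """Weighted count of the two error types of the rank-one approximation x y^T.

    Assumed objective (not taken from the article):
    w_fn * #(A=1, approx=0) + w_fp * #(A=0, approx=1).
    """
    A = np.asarray(A, dtype=int)
    approx = np.outer(x, y)
    false_neg = np.sum((A == 1) & (approx == 0))
    false_pos = np.sum((A == 0) & (approx == 1))
    return w_fn * false_neg + w_fp * false_pos


def rank_one_alternating(A, y0, w_fn=1.0, w_fp=1.0, max_iter=50):
    """Alternating heuristic: fix y and pick the best x, then fix x and pick the best y.

    Each half-step is optimal for the assumed objective, but the overall
    result is only a local optimum (the problem is NP-hard in general).
    """
    A = np.asarray(A, dtype=int)
    y = np.asarray(y0, dtype=int)
    x = np.zeros(A.shape[0], dtype=int)
    for _ in range(max_iter):
        # Select row i only if the 1s it covers outweigh the 0s it covers.
        new_x = (w_fn * (A @ y) - w_fp * ((1 - A) @ y) > 0).astype(int)
        # Same rule for columns, given the selected rows.
        new_y = (w_fn * (new_x @ A) - w_fp * (new_x @ (1 - A)) > 0).astype(int)
        if np.array_equal(new_x, x) and np.array_equal(new_y, y):
            break
        x, y = new_x, new_y
    return x, y


# Small example: a 2x2 block of ones plus one isolated one.
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
x, y = rank_one_alternating(A, y0=np.ones(3, dtype=int))
print(x, y, weighted_error(A, x, y))
```

Raising `w_fp` relative to `w_fn` in this sketch penalizes covering 0s more heavily and so favors smaller, denser patterns, while raising `w_fn` favors larger patterns that tolerate some covered 0s. That is one plausible reading of "parameters controlling different types of approximation errors".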

Source journal
ACM Transactions on Management Information Systems (COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 6.30
Self-citation rate: 20.00%
Articles published: 60