An Online Semi-NMF Algorithm for Soft-Clustering of Financial Institutions
Authors: Yuan Cheng, Shawn Mankad
Published in: Proceedings of the 5th Workshop on Data Science for Macro-modeling with Financial and Economic Datasets, 2019-06-30
DOI: https://doi.org/10.1145/3336499.3338005
We propose an online semi-nonnegative matrix factorization (semi-NMF) framework that clusters firms by their stock returns. The model is motivated by an accounting balance sheet identity: one of the estimated matrix factors can be interpreted as each firm's percentage of holdings across asset classes (stocks, bonds, etc.), an important input for risk analysis. We also show that the model extends soft K-means clustering. To make the proposed model (OSNMF) practical, we develop a fast estimation framework that can cluster firms in real time as new data become available. The model is validated on synthetic and real data. Specifically, we apply the technique to recover the asset holdings of mutual funds and ETFs from stock returns and show that the estimates closely match their disclosed balance sheets.
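To make the factorization idea concrete, the following is a minimal batch sketch of generic semi-NMF, not the OSNMF estimator of the paper, whose online updates and constraints the abstract does not specify. It assumes a returns matrix X (time by firms) approximated as F G^T, where F (time by asset classes) is unconstrained and G (firms by asset classes) is nonnegative, and it uses the standard multiplicative update rule for semi-NMF. The function name `semi_nmf`, the synthetic data, and the final row normalization of G into share-like weights are all illustrative assumptions.

```python
import numpy as np


def pos(a):
    """Elementwise positive part of a matrix."""
    return (np.abs(a) + a) / 2.0


def neg(a):
    """Elementwise negative part of a matrix (returned as nonnegative values)."""
    return (np.abs(a) - a) / 2.0


def semi_nmf(X, k, n_iter=200, eps=1e-12, seed=0):
    """Approximate X (T x n) by F @ G.T with F unconstrained and G >= 0.

    Generic batch semi-NMF with multiplicative updates; a hypothetical
    helper for illustration, not the paper's online algorithm.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    G = rng.random((n, k)) + 0.1          # strictly positive start
    x_norm = np.linalg.norm(X)
    errs = []
    for _ in range(n_iter):
        # Least-squares update of the unconstrained factor given G.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        errs.append(np.linalg.norm(X - F @ G.T) / x_norm)
        # Multiplicative update of G; keeps every entry nonnegative.
        XtF = X.T @ F                     # n x k
        FtF = F.T @ F                     # k x k
        num = pos(XtF) + G @ neg(FtF)
        den = neg(XtF) + G @ pos(FtF)
        G = G * np.sqrt((num + eps) / (den + eps))
    F = X @ G @ np.linalg.pinv(G.T @ G)
    errs.append(np.linalg.norm(X - F @ G.T) / x_norm)
    return F, G, errs


# Synthetic example (assumed sizes): 250 days, 30 firms, 3 asset classes.
rng = np.random.default_rng(42)
T, n, k = 250, 30, 3
F_true = rng.normal(0.0, 0.01, size=(T, k))       # asset-class returns
G_true = rng.dirichlet(np.ones(k), size=n)        # holdings, rows sum to 1
X = F_true @ G_true.T + rng.normal(0.0, 1e-4, size=(T, n))

F, G, errs = semi_nmf(X, k)

# Illustrative post-processing: rescale each firm's row to share-like weights.
W = G / np.maximum(G.sum(axis=1, keepdims=True), 1e-12)
```

Because F and G are identified only up to column scaling and permutation, the rows of `W` should be read as soft cluster memberships rather than as exact recovered holdings. Comparing them with disclosed balance sheets, as the paper does, would require the paper's own constraints and estimation procedure.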