DeepGRASS: Graph, Sequence and Scaled Embeddings on large scale transactions data

Mahesh Balan Umaithanu, Vignesh Ravichandran, M. Rohith Srinivaas, Venkat Subramanian Selvaraj
Published in: 2021 Swedish Workshop on Data Science (SweDS), 2021-12-02
DOI: 10.1109/SweDS53855.2021.9638270 (https://doi.org/10.1109/SweDS53855.2021.9638270)

Abstract

Representation learning has redefined large-scale data mining applications. High-dimensional embeddings learn complex associations that go beyond what humans can readily identify, and they have been highly successful in business applications that face the curse of dimensionality, including fintech. Different algorithms learn embeddings that capture different types of associations, so it would be useful to learn embeddings that capture multi-dimensional associations holistically. In this paper, we propose DeepGRASS, an algorithm that embeds financial transactions using graph- and sequence-based topologies. Our results show that these embeddings learn comprehensive, holistic, and multi-dimensional associations. We deploy DeepGRASS at PayPal and train it on a multitude of transaction data with multi-dimensional features. The algorithm is two-fold: it embeds a bipartite graph with customer and merchant nodes and, in parallel, learns sequential associations from historical transactions along with other transactional features. These embeddings are then scaled and combined to learn multi-dimensional associations. We tested the embeddings on different predictive applications and find that the learning is generic and achieves benchmark performance across predictive contexts. Based on offline metrics, back-tests, and sensitivity analysis on offline transaction data, we find very strong evidence that these embeddings provide the highest AUC score in predictive applications and the highest coefficient of determination in explaining variance, and that the features explain different types of associations. To our knowledge, this is the first application of embeddings that learn both graph- and sequence-based associations on large-scale financial transaction data, and it paves the way for a new generation of feature engineering in fintech.
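
The two-branch idea described in the abstract (a graph embedding of the customer-merchant bipartite graph, a sequence embedding of each customer's transaction history, then scaling and combining) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not name the graph model, the sequence model, or the scaling method. The sketch assumes a truncated SVD of the bipartite adjacency matrix as a stand-in graph embedding, a recency-weighted average of merchant vectors as a stand-in sequence embedding, and per-column standardization followed by concatenation as the "scale and combine" step. The transaction data, the `decay` parameter, and all function names are hypothetical.

```python
import numpy as np

# Hypothetical toy transactions: (customer_id, merchant_id), listed in time order.
transactions = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 0), (2, 2), (0, 2)]
n_customers, n_merchants, dim = 3, 3, 2

# Graph view: bipartite customer-merchant adjacency holding transaction counts.
adjacency = np.zeros((n_customers, n_merchants))
for customer, merchant in transactions:
    adjacency[customer, merchant] += 1.0

# Stand-in graph embedding (assumption): truncated SVD of the adjacency matrix.
u, s, vt = np.linalg.svd(adjacency, full_matrices=False)
customer_graph = u[:, :dim] * np.sqrt(s[:dim])   # one row per customer
merchant_vecs = vt[:dim].T * np.sqrt(s[:dim])    # one row per merchant


def sequence_embedding(history, decay=0.5):
    """Stand-in sequence embedding (assumption): recency-weighted mean of
    the merchant vectors in a customer's history, newest weighted most."""
    weights = decay ** np.arange(len(history) - 1, -1, -1)
    return (weights[:, None] * merchant_vecs[history]).sum(axis=0) / weights.sum()


# Sequence view: each customer's ordered list of merchants.
histories = {
    c: [m for cust, m in transactions if cust == c] for c in range(n_customers)
}
customer_seq = np.stack([sequence_embedding(histories[c]) for c in range(n_customers)])


def standardize(block, eps=1e-8):
    """Scale each embedding column to zero mean and (near) unit variance."""
    return (block - block.mean(axis=0)) / (block.std(axis=0) + eps)


# Scale each branch separately, then concatenate into one feature vector per customer.
combined = np.hstack([standardize(customer_graph), standardize(customer_seq)])
```

Standardizing each branch before concatenation keeps either embedding from dominating a downstream model purely because of its numeric range. The resulting `combined` matrix has one row per customer and could be fed to any predictive model as features.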